PGIT in 1960


1960 should be an interesting year for PGIT. 
Eleven years after the publication of C. E. Shannon’s 
basic work, the first serious attempts to apply 
coding to get reliable transmission of digital data 
over unreliable channels have started in several 
laboratories. 

Interest in finding engineering solutions for the 
problems involved will undoubtedly spread further 
during this year. Sending reliable information around 
the solar system using limited power is one of the 
applications which is causing increased interest. We 
shall probably be seeing reports on progress in the 
implementation of coding and decoding techniques 
in coming issues of these TRANSACTIONS. 

After much discussion, PGIT is introducing two 
new kinds of special issues of TRANSACTIONS. One 
is the monograph—too long for publication as a 
journal article, too specialized for publication as a 
book—which can usually be distributed only in the 
form of a technical report to those lucky enough to 
have heard of it. The first issue of this sort, a well- 
known but unpublished report by J. I. Marcum on 
radar detection problems, is in the mill. So is the 
first number of another type of special issue: a 
series of submitted papers on matched filter techniques,
being organized by Paul E. Green, Jr.,
Vice-Chairman of PGIT and Associate Editor of
these TRANSACTIONS. Other special issues will be 
coming along, and suggestions along these lines 
should be directed to P. E. Green, Jr. at M.I.T. 
Lincoln Laboratory, Lexington, Mass. 

In addition to publishing Transactions, PGIT 
organizes meetings, or cooperates with others who 
do so. The success of the International Symposium 
on Circuit and Information Theory in Los Angeles 
last June, organized by G. L. Turin, R. A. Epstein, 
and others, has suggested that we should program 
such meetings regularly in the United States every 
two years. Such programming would simplify the 
job of the organizers of the meetings, since there 
would be longer notice and more time to get the 
many necessary jobs done. In the intervening years 
we might cosponsor or even help to organize meetings 
held in other countries. Along these lines we are 
beginning to plan for a meeting on the East Coast 
during the Spring or Fall of 1961. 

I look forward to hearing any suggestions you 
may have for improving the effectiveness of PGIT 
activity in connection with meetings, publications 
or any other services which the group might effec- 
tively perform. 

PETER ELIAS
A Class of Systematic Codes for Non-Independent Errors*


N. M. ABRAMSON†


Summary—A class of systematic codes has been obtained which 
will correct all single errors and all double errors which occur in 
adjacent digits. These codes use significantly fewer checking 
digits than codes which correct all double errors. In addition, 
because of inherent regularities in their structure, these codes may 
be instrumented in a strikingly simple fashion. 


INTRODUCTION 


THE occurrence of independent errors in a binary
communication system may be combatted quite
effectively by the use of single and multiple error-correcting
block codes [1]-[4]. In many cases, however
(e.g., telephone lines with impulse noise and magnetic
tape with signal "drop-out"), errors do not occur independently;
the probability that the nth binary digit
will be in error will depend upon whether the (n - 1)th
binary digit was in error. That is, in many situations, if 
two errors occur in one word, the errors are probably in 
adjacent binary digits. Using a double-error correcting 
code in such situations will increase the reliability of 
transmission at the cost of greatly increasing the number 
of binary digits necessary to transmit each word. Such a 
code clearly does not make use of the fact that double 
errors are likely to be adjacent errors. 

This paper discusses the construction of a class of systematic
codes¹ which corrects all single errors and all
double errors when these double errors occur in adjacent 
binary digits. These codes require significantly fewer 
parity check digits than codes which correct all double 
errors. Furthermore, because of inherent regularities in 
the structure of these codes, they can be instrumented 
with remarkable simplicity. 

We have organized this paper into three sections. 
Section I presents a set of rules without their derivation, 
which will allow the reader to construct single-error- 
correcting-double-adjacent-error-correcting (SEC-DAEC) 
codes. A simple discussion indicating how the codes might 
be instrumented is also included in Section I. Section II 
contains a detailed example of the construction and use 
of a SEC-DAEC code. Section III contains a discussion 
of the properties of SEC-DAEC codes in general, and 
defines the subclass of SEC-DAEC codes obtained in 
this paper. Section III also contains a derivation of the 
rules presented in Section I, and some miscellaneous 
observations on the properties of the codes. 


* Manuscript received by the PGIT, December 8, 1958. The work 
reported in this paper was supported jointly by the Office of Naval 
Research, Contract Nonr 225-24H, and by IBM Research Labora- 
tory. 

† Stanford Electronics Laboratories, Stanford University,
Stanford, Calif.

¹ See sec. 7 of Hamming [1].
SECTION I


A. The Number of Information Digits 


Because we restrict our attention to systematic block 
codes, we may write any possible word of our code as 


$$
\overbrace{\underbrace{x_1 x_2 \cdots x_m}_{m \text{ binary digits}} \; \underbrace{y_1 y_2 \cdots y_k}_{k \text{ binary digits}}}^{n \text{ binary digits}} \tag{1}
$$

That is, we shall assume that each word is n binary
digits long. Of these n digits, m digits may be used to
convey the information; we denote these by $x_1, x_2, \cdots, x_m$.
The remaining k digits, denoted by $y_1, y_2, \cdots, y_k$, are
determined by suitable parity checks over the information
digits.

The first property of the codes which must be de- 
termined is the number of parity digits necessary for any 
given number of information digits. Stated another way, 
we might ask for the largest number of information 
digits we may use for a specified number of parity digits, 
k. In Table I, we have listed, for k = 1, 2, ..., 10, the
maximum number of information digits possible for the
class of SEC-DAEC codes obtained. This upper bound
is denoted by m*. For purposes of comparison, we have
also given in the same table, m**, an upper bound on the 
number of information digits possible for codes which 
correct all double errors. 
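The two columns of Table I can be reproduced numerically. The sketch below is illustrative only: `m_star` uses the closed form $m^* = 2^{k-1} - k - 1$ derived in Section III-A, while `m_double_star` assumes that m** is the sphere-packing bound $1 + n + n(n-1)/2 \le 2^k$ with n = m + k. The text only calls m** "an upper bound", so that identification, like both function names, is an assumption made here.

```python
def m_star(k):
    """Information digits of a complete SEC-DAEC code: m* = 2^(k-1) - k - 1."""
    return 2 ** (k - 1) - k - 1


def m_double_star(k):
    """Largest m with 1 + n + n(n-1)/2 <= 2^k for n = m + k.

    Assumption: m** is the sphere-packing bound for double-error correction.
    """
    n = k
    # grow the block length while one more digit still fits under the bound
    while 1 + (n + 1) + (n + 1) * n // 2 <= 2 ** k:
        n += 1
    return n - k


for k in range(3, 11):
    print(k, m_star(k), m_double_star(k))
```

For k = 3 through 10 both functions agree with the legible entries of Table I.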

We shall present a set of rules for constructing SEC- 
DAEC codes where, for any given number of parity 
digits k, the number of information digits is m*, the 
maximum possible as indicated in Table I. Codes with 
the number of information digits less than m* may be
constructed by first obtaining the code for m* information
digits in the usual manner, and then setting the first few
information digits equal to zero.


B. The Parity Check Table 


To specify a systematic code completely, it is only 
necessary to specify the digits checked by each of the 
parity digits. For example, in a SEC-DAEC code of total
block length 7, where we have 3 information digits and 
4 parity digits, we can completely specify a code by 
filling in a parity check table, as shown in Fig. 1. 

This is merely a convenient way of indicating that for 
this code 

$y_1$ checks $x_1$ and $x_2$,
$y_2$ checks $x_2$ and $x_3$,
$y_3$ checks $x_3$ and $y_1$;
$y_4$ checks all the digits.


TABLE I

THE NUMBER OF INFORMATION DIGITS
FOR SEC-DAEC AND SEC-DEC CODES

k     m*    m**
1     -     -
2     -     -
3     0     0
4     3     1
5     10    2
6     25    4
7     56    8
8     119   14
9     246   22
10    501   34

k = number of parity digits.
m* = upper bound on the number of possible information digits
for the class of SEC-DAEC codes obtained.
m** = upper bound on the number of possible information digits
for ordinary double error correcting codes.


      x1   x2   x3   y1   y2   y3   y4
y1    x    x         x
y2         x    x         x
y3              x    x         x
y4    x    x    x    x    x    x    x

Fig. 1. A parity check table.


The rules, then, for filling out a parity check table of
k rows (corresponding to the k parity digits) and
m* + k = n columns (corresponding to the m* information
digits and the k parity digits) are:

1) Number the rows $y_1, y_2, \cdots, y_k$, as indicated in the
example above.

2) Let $y_k$ be a parity check over all the digits.

3) The parity checks for $y_1$ may be determined from
Table II. For any given k, use the sequence of zeros
and ones in Table II to obtain the parity checks
on $y_1$ as follows:

For any k the corresponding sequence in Table
II is m* + k digits long. If the jth digit in this
sequence is a zero, $y_1$ will check the jth digit
of the code words; if the jth digit in this sequence
is a one, $y_1$ will not check the jth digit of the
code words. That is, we need only write the
sequence obtained from Table II in the first
row of the parity check table we are constructing,
using x for a zero and a blank space
for a one. For example: for k = 4, Table II
gives the sequence

0 0 1 0 1 1 1

and in the parity check table shown in Fig. 1,
the first row has x in the first, second, and
fourth places.

4) The parity checks in each succeeding row of the
parity check table (except the last, $y_k$) are then the
same as the row directly above, except that they are
shifted to the right by one digit (see Fig. 1).


TABLE II

PARITY CHECKS FOR $y_1$ IN SEC-DAEC CODES
(m SEQUENCES)

Digit                            Digit
Number   k = 4   k = 5   k = 6   Number   k = 6
1        0       0       0       17       0
2        0       1       0       18       1
3        1       0       0       19       0
4        0       1       1       20       0
5        1       1       1       21       1
6        1       0       0       22       0
7        1       0       1       23       1
8                1       1       24       1
9                0       1       25       0
10               0       0       26       0
11               0       1       27       1
12               1       0       28       1
13               1       1       29       1
14               1       0       30       1
15               1       0       31       1
16                       0


Fig. 2. A four-stage shift register.

These four rules, together with Table II, allow one to 
construct SEC-DAEC codes of block length less than, 
or equal to, 31 digits. It is, of course, possible to con- 
struct SEC-DAEC codes with arbitrary block length. 
This construction is explained directly below. 
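The four rules can be sketched in a few lines of Python. This is a minimal illustration rather than the author's procedure: the function name `parity_check_table` and the representation of each row as a set of 1-based digit positions are choices made here.

```python
def parity_check_table(sequence):
    """Return rows y_1 .. y_k as sets of checked digit positions (1-based).

    `sequence` is the Table II column for the chosen k; its length is
    n = m* + k = 2^(k-1) - 1.
    """
    n = len(sequence)
    k = (n + 1).bit_length()          # n + 1 = 2^(k-1) has exactly k bits
    # Rule 3: y_1 checks digit j when the jth sequence digit is a zero.
    first = {j + 1 for j, bit in enumerate(sequence) if bit == 0}
    # Rule 4: each following row is the row above shifted right by one digit.
    # No wrap-around arises because the sequence ends in k - 1 ones.
    rows = [{j + shift for j in first} for shift in range(k - 1)]
    # Rule 2: y_k checks every digit.
    rows.append(set(range(1, n + 1)))
    return rows


table = parity_check_table([0, 0, 1, 0, 1, 1, 1])   # k = 4 column of Table II
for i, row in enumerate(table, start=1):
    print(f"y{i} checks digits {sorted(row)}")
```

With the k = 4 sequence the first row comes out as digits 1, 2, and 4, matching the description of Fig. 1.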


C. Shift Registers 


The fact that the codes obtained in this paper may be
simply instrumented depends directly upon the properties
of the sequence of zeros and ones giving the parity checks
for $y_1$ (see Rule 3 above). Each of the sequences of length
$2^R - 1$, listed in Table II, may be derived from an R-stage
binary shift register. Fig. 2 is a block diagram of a four-stage
register. (For the most part, we follow the notation
of Zierler [5].)

The $F_i$ are flip-flops, the circles are switches ($S_i = 0$
means open, $S_i = 1$ closed), and the small squares are
mod-2 adders. For a given set of zeros and ones, $S_1$, $S_2$,
$S_3$, and $S_4$, we insert an arbitrary binary number (except
0000) into the flip-flops. This binary number is then
shifted to the right every T seconds, while simultaneously
we feed the mod-2 sum of the switched flip-flop outputs
into $F_1$, as indicated in Fig. 2. The successive
entries of zeros and ones into $F_1$ then form a linear binary
shift-register sequence. There is extensive literature on
the properties and uses of these sequences [4]-[7].

For certain values of the $S_i$, the successive entries in
the flip-flops of an R-stage register will be all of the
R-digit binary numbers, except the all-zero number. In
this case, the sequence of zeros and ones in $F_1$ is periodic
of period $2^R - 1$, and is called a maximal length linear
binary shift-register sequence, or m sequence [4]. The
sequence of zeros and ones used in Rule 3 can always be
taken to be an m sequence out of a (k - 1)-stage shift register
ending in k - 1 ones.

Values² for the $S_i$ which yield m sequences for R-stage
registers, R = 3, 4, ..., 19, are given in Table III. For
example, for R = 4, we see that $S_1 = 1$, $S_2 = 0$, $S_3 = 0$,
$S_4 = 1$. Putting all ones into the four flip-flops of Fig. 2
with these $S_i$ will cause the sequence of length $2^4 - 1 = 15$
given in Table II to appear in $F_1$, starting with the first
shift.

Using Table III, we may construct SEC-DAEC codes
of block lengths from five digits to $2^{19} - 1$ digits, a range
which seems adequate for most applications.
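The register of this section can be simulated directly. The sketch below assumes that the digit fed into $F_1$ at each shift is the mod-2 sum of $S_i F_i$ over all stages, which is consistent with the R = 4 example and with Table II; the function name `m_sequence` is illustrative.

```python
def m_sequence(taps, start):
    """Return the digits fed into F_1 over one full period of 2^R - 1 shifts.

    taps[i] is S_(i+1); start lists the initial flip-flop contents, F_1 first.
    Assumption: the feedback digit is the mod-2 sum of S_i * F_i.
    """
    state = list(start)
    out = []
    for _ in range(2 ** len(taps) - 1):
        feedback = sum(s & f for s, f in zip(taps, state)) % 2
        state = [feedback] + state[:-1]   # shift right, new digit enters F_1
        out.append(feedback)
    return out


# R = 4 with S_1 = 1, S_2 = 0, S_3 = 0, S_4 = 1 and all flip-flops set to one
print(m_sequence([1, 0, 0, 1], [1, 1, 1, 1]))
```

Under these assumptions the output is the k = 5 column of Table II, ending in four ones.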


D. Encoding and Decoding 


The encoding operation for these SEC-DAEC codes
then can make use of the fact that the parity checks are
derived from m sequences. For example, the timing
signals for $y_1$ may be obtained directly from the first
flip-flop of a (k - 1)-stage shift register, which is started
with ones in all k - 1 flip-flops. The timing signals for
$y_2$ to $y_{k-1}$ are the same, but are shifted in time.

Decoding is almost as simple as encoding. The first
step is to form the checking number $C_1 C_2 \cdots C_k$, as in an
ordinary Hamming code. That is, we set $C_i = 0$ if $y_i$ of
the received word satisfies its parity check; we set $C_i = 1$
if $y_i$ does not satisfy its parity check. The timing signals
for this operation are derived in the same manner as in
the encoding operation.
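Encoding and the formation of the checking number can be sketched in software for the k = 5 code. The sketch assumes even parity (a check is satisfied when the checked digits sum to zero mod 2) and fills in the parity digits in order, which works because each $y_i$ checks only information digits and parity digits up to itself. The names `encode` and `checking_number` are illustrative.

```python
SEQ_K5 = [0, 1, 0, 1, 1, 0, 0, 1, 0, 0, 0, 1, 1, 1, 1]   # Table II, k = 5
N, K = 15, 5
FIRST = [j + 1 for j, bit in enumerate(SEQ_K5) if bit == 0]
# Rows y_1 .. y_4 are shifted copies of the first row; y_5 checks every digit.
ROWS = [{j + s for j in FIRST} for s in range(K - 1)] + [set(range(1, N + 1))]


def encode(info):
    """Append K parity digits to the N - K information digits."""
    word = list(info) + [0] * K
    for i, row in enumerate(ROWS):
        pos = N - K + i + 1                   # position of y_(i+1) in the word
        # choose y so the digits checked by this row have even parity (assumed)
        word[pos - 1] = sum(word[j - 1] for j in row if j != pos) % 2
    return word


def checking_number(word):
    """Return C_1 .. C_K, where C_i = 1 when parity check i fails."""
    return [sum(word[j - 1] for j in row) % 2 for row in ROWS]


sent = encode([1, 0, 1, 1, 0, 0, 1, 0, 1, 1])
received = sent[:]
received[5] ^= 1                              # single error in the sixth digit
print(checking_number(sent), checking_number(received))
```

The checking number of an error-free word is all zeros, and $C_k$ separates single errors from double adjacent errors as described in the possibilities that follow.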

There are, then, four possibilities:

1) All the $C_i$ are zero: No errors occurred.

2) $C_k$ is one; $C_i$ is zero for some i: A single error occurred.
We take a (k - 1)-stage shift register with the $S_i$
given by Table III. This shift register is also started
with all ones in the flip-flops. We start shifting the
contents of the flip-flops in the usual manner. After
some number of shifts, say N, the mod-2 sum of the
number in the flip-flops and the first k - 1 digits of
the checking number will all be ones.

The error occurred in the Nth digit of the word.

3) $C_k$ is zero; $C_i$ is one for some i: A double adjacent
error occurred.³ We take a (k - 1)-stage shift register
with the $S_i$ given by Table III. This shift register
(which can be the same register used for possibility 2)
is started with a one in $F_1$ and zeros in all the
other flip-flops. After N shifts, the mod-2 sum of the
number in the flip-flops and the first k - 1 digits
of the checking number will all be zeros.

The double adjacent error occurred in the Nth and
(N + 1)th (mod n) digits of the word.

4) All the $C_i$ are one: An odd number of errors greater
than one occurred.


² In general, these are not the only acceptable values for the $S_i$.
This table was constructed with the aid of Marsh's Table [8].

³ These codes consider the case where both the first and last
digits of a word are in error, as a double adjacent error, and automatically
correct the word, if this occurs.


TABLE III

VALUES OF $S_i$ FOR R-STAGE MAXIMAL LENGTH LINEAR BINARY
SHIFT REGISTERS (SEE FIG. 2)

[The entries of this table, for R = 3, 4, ..., 19, are not recoverable
from this copy. The R = 4 column, as quoted in Section I-C, is
$S_1 = 1$, $S_2 = 0$, $S_3 = 0$, $S_4 = 1$.]


SECTION II
A. Formation of a Code 


Let us assume we wish to obtain an SEC-DAEC code
with ten information digits. From Table I, we see that it
is necessary to use at least five parity digits. Using the
notation of (1), then, we have

$$
\begin{aligned}
m &= 10 \\
k &= 5 \\
n &= m + k = 15.
\end{aligned}
$$

Now we construct a parity-check table of five rows and
fifteen columns, as in Fig. 3.

The $y_1$ row was obtained from the sequence corresponding
to k = 5 in Table II. The $y_2$, $y_3$, and $y_4$ rows are just
shifted versions of the $y_1$ row. The $y_5$ row is, of course,
trivial, since $y_5$ checks all digits.


B. Error Correcting 


To demonstrate the error correcting process, we list in
Table IV successive contents of the k - 1 or 4-stage shift
register, starting with 1111 and also starting with 1000.
From Table III, we see that $S_1 = S_4 = 1$, and $S_2 = S_3 = 0$.


      x1   x2   x3   x4   x5   x6   x7   x8   x9   x10  y1   y2   y3   y4   y5
y1    x         x              x    x         x    x    x
y2         x         x              x    x         x    x    x
y3              x         x              x    x         x    x    x
y4                   x         x              x    x         x    x    x
y5    x    x    x    x    x    x    x    x    x    x    x    x    x    x    x

Fig. 3. Parity-check table for m = 10, k = 5.


TABLE IV

THE CONTENTS OF A FOUR-STAGE SHIFT REGISTER

Number of Shifts    (1111 Start)    (1000 Start)
                    F1 F2 F3 F4     F1 F2 F3 F4
1                   0  1  1  1      1  1  0  0
2                   1  0  1  1      1  1  1  0
3                   0  1  0  1      1  1  1  1
4                   1  0  1  0      0  1  1  1
5                   1  1  0  1      1  0  1  1
6                   0  1  1  0      0  1  0  1
7                   0  0  1  1      1  0  1  0
8                   1  0  0  1      1  1  0  1
9                   0  1  0  0      0  1  1  0
10                  0  0  1  0      0  0  1  1
11                  0  0  0  1      1  0  0  1
12                  1  0  0  0      0  1  0  0
13                  1  1  0  0      0  0  1  0
14                  1  1  1  0      0  0  0  1
2^4 - 1 = 15        1  1  1  1      1  0  0  0


Now assume that we receive a word where only the
sixth digit is in error. From the parity check table of this
code (Fig. 3), we see that the checking number corresponding
to an error in the sixth place is 10011. The last
digit of the checking number is 1. Therefore, we start
our shift register with 1111. After 6 shifts (see Table IV),
the number in the flip-flops is 0110. This will give 1111
when added to the first four digits of the checking number,
so we know that the error occurred in the sixth digit.

Let us now demonstrate the correction of a double
adjacent error. Assume that the eighth and ninth digits
of the received word are in error. From Fig. 3, we see that
the checking number obtained will be 11010. The last
digit of the checking number is zero. Therefore, we start
our shift register with 1000.

After eight shifts (see Table IV), the number in the flip-flops
is 1101. This will give 0000 when added to the first
four digits of the checking number, so we know that the
errors occurred in the eighth and ninth digits.
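The two worked corrections can be replayed in software. The following is a sketch for the k = 5 code only, assuming the feedback rule that the new content of $F_1$ is the mod-2 sum of $S_i F_i$; the names `shift` and `locate` are illustrative and do not come from the paper.

```python
TAPS = [1, 0, 0, 1]     # S_1 .. S_4 for the four-stage register (R = 4)
N = 15                  # block length of the k = 5 code


def shift(state):
    """One shift to the right; the mod-2 feedback digit enters F_1."""
    feedback = sum(s & f for s, f in zip(TAPS, state)) % 2
    return [feedback] + state[:-1]


def locate(check):
    """Map a checking number C_1 .. C_5 to the list of digits in error."""
    head, last = check[:-1], check[-1]
    if not any(check):
        return []                                        # possibility 1
    if all(check):
        raise ValueError("odd number of errors greater than one")   # possibility 4
    # C_k = 1: start at 1111 and look for an all-ones sum (single error).
    # C_k = 0: start at 1000 and look for an all-zeros sum (double adjacent).
    state = [1, 1, 1, 1] if last else [1, 0, 0, 0]
    target = [1, 1, 1, 1] if last else [0, 0, 0, 0]
    for shifts in range(1, N + 1):
        state = shift(state)
        if [a ^ b for a, b in zip(state, head)] == target:
            if last:
                return [shifts]                          # possibility 2
            return [shifts, shifts % N + 1]              # possibility 3
    raise ValueError("checking number not recognised")


print(locate([1, 0, 0, 1, 1]))   # checking number 10011
print(locate([1, 1, 0, 1, 0]))   # checking number 11010
```

The expression `shifts % N + 1` implements the convention that the first and last digits of a word count as adjacent.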


SECTION III


A. Complete SEC-DAEC Codes 


In this section we shall first investigate the properties 
of SEC-DAEC codes in general. After this, we shall place 
some restrictions suggested by practical considerations 
upon the codes we wish to consider. Finally, we shall show 
how these restrictions lead to the rules given in Section I.

Consider the set of all possible checking numbers for a
code with m information digits and k parity digits. There
are $2^k$ of these; for an SEC-DAEC code, each of these
$2^k$ numbers must indicate one of 2(m + k) + 1 possibilities.⁴
Therefore, we have the inequality

$$
2^k \ge 2(m + k) + 1. \tag{2a}
$$

Now the left side of (2a) is an even number, while the
right side is an odd number, so that we may add one to
the right side of (2a) without destroying the inequality.

$$
2^k \ge 2(m + k) + 2 \tag{2b}
$$

or, finally,

$$
m \le 2^{k-1} - k - 1. \tag{2c}
$$

For any given k, let m* be the value of m obtained by
using the equality in the above equation. That is,

$$
m^* = 2^{k-1} - k - 1. \tag{3}
$$

We shall call an SEC-DAEC code with m = m* a complete
SEC-DAEC code. Values of m* may be obtained
from Table I.


B. The Checking Numbers 


In order to specify completely an SEC-DAEC code of
n = m + k digits, it is sufficient to give the n checking
numbers corresponding to the n possible single errors.
Given these checking numbers, we can construct the
parity check table as in Fig. 3, and thus the code itself.
Let us therefore investigate the restrictions we must
impose upon these checking numbers.

Let $M_i$ be the checking number which indicates an
error in the ith digit of the received word. Let $N_i$ be the
checking number which indicates an error in both the
ith and (i + 1)th digits of the received word. Let $I_k$ be
the checking number which indicates that the received
word contains no errors. $I_k$ must, of course, be the k-digit
binary number consisting of all zeros. Then we form an
array of these k-digit binary numbers as shown below:

$$
\begin{array}{cc}
I_k & \\
M_1 & N_1 \\
M_2 & N_2 \\
\vdots & \vdots \\
M_n & N_n
\end{array} \tag{4}
$$

Now we note the following facts:

a) The $M_i$, $N_i$ and $I_k$ are all distinct.

b) Let $A = (a_1, a_2, \cdots, a_k)$ and
$B = (b_1, b_2, \cdots, b_k)$ be two k-digit binary
numbers.

We define the sum of A and B as follows:

$$
A + B = (c_1, c_2, \cdots, c_k) \tag{5}
$$

where

$$
c_i = a_i + b_i \pmod 2.
$$

Then we must have⁵

$$
M_i + M_{i+1} = N_i, \qquad i = 1, 2, \cdots, n. \tag{6}
$$


⁴ There are (m + k) possible single errors, (m + k) possible
double adjacent errors (since an error in the first and last digits
of the block is considered a double adjacent error) and the possibility
of no errors.


To these two facts, which must hold for any SEC-DAEC
code, we may add a third, which is true for complete
SEC-DAEC codes. For this type of code, we have
[from (3)]

$$
n = m^* + k = 2^{k-1} - 1 \tag{7}
$$


so that we may state the third fact: 

c) For a complete SEC-DAEC code, the 2n + 1 check- 
ing numbers in the array (4) include all the k-digit 
binary numbers, except one. 

The problem of finding a complete SEC-DAEC code 
with k parity-check digits, then, is equivalent to the 
problem of finding an array of k-digit binary numbers 
satisfying facts a)-c). 

In Section III-D, we shall show how to construct such 
an array. Before doing so, however, it will be useful to 
impose an additional restriction on the array. 


C. All-Check Codes 


It is necessary that the parity checks display some 
sort of regularity so that the equipment to instrument 
a SEC-DAEC code does not make the use of such a code 
impractical. One restriction on the parity check digits 
which arises quite naturally from a consideration of the 
instrumentation of the code, is that there be a simply 
obtained distinction between a single error and a double 
adjacent error. The obvious way to obtain such a distinction
is to require that some parity check digit, say
$y_k$, be a parity check over all the digits. If $y_k$ checks all
the digits, then the digit of the check word corresponding
to $y_k$ will be 1 for a single error, and 0 for a double adjacent
error (or no error). We therefore set the additional requirement
on our codes that:


The last parity digit, $y_k$, is a parity check
over all the digits of the code word. (9)


We shall call a SEC-DAEC code obeying (9) an all- 
check SEC-DAEC code. 
D. Properties of All-Check Codes 


From this point on, then, we shall restrict our attention
to all-check SEC-DAEC codes. It will simplify the discussion
to follow, if we first show how the three facts
listed in Section III-B are modified for all-check SEC-DAEC
codes. Note that in the array of checking numbers
(4) for such codes, the last digit of each $M_i$ is a one, and
the last digit of each $N_i$ is a zero. We define $A_i$ as the
(k - 1)-digit binary number consisting of the first k - 1
digits of $M_i$, and $B_i$ as the first k - 1 digits of $N_i$. Then,
instead of an array of k-digit binary numbers, as in (4),
we may consider the array of (k - 1)-digit binary numbers
given below:

$$
\begin{array}{cc}
A_1 & B_1 \\
A_2 & B_2 \\
\vdots & \vdots \\
A_n & B_n
\end{array} \tag{10}
$$


⁵ Subscripts in this and following equations should always be
taken modulo n.

Given the array (10) it is, of course, trivial to construct
the array given in (4), and thus the corresponding
SEC-DAEC code. Let us then express facts a)-c) in
terms of the $A_i$ and $B_i$, instead of the $M_i$ and $N_i$. This
will give us the three equivalent conditions [note the
definition of addition in (5)]:

a') $A_i + A_j = I_{k-1}$ if and only if i = j;
$B_i + B_j = I_{k-1}$ if and only if i = j;
$B_i \ne I_{k-1}$ for any i.

b') $A_i + A_{i+1} = B_i$, i = 1, 2, ..., n.

c') For a complete all-check SEC-DAEC code, $(A_1, A_2, \cdots, A_n)$
includes all the (k - 1)-digit binary
numbers, except one, and $(B_1, B_2, \cdots, B_n)$ also
includes all the (k - 1)-digit binary numbers, except
one ($I_{k-1}$).

We shall call an array such as (10), satisfying the above
three conditions, a check array.


E. The Check Array 


If we are given one check array, it is possible to obtain
other check arrays through certain simple operations. For
example, if (10) is a check array, any cyclic permutation
of the elements of (10) is also a check array. That is,

$$
\begin{array}{cc}
A_n & B_n \\
A_1 & B_1 \\
A_2 & B_2 \\
\vdots & \vdots \\
A_{n-1} & B_{n-1}
\end{array} \tag{11}
$$

is a check array.

Furthermore, we may add an arbitrary (k - 1)-digit
binary number, C, to each of the $A_i$ in a check array,
as in (12), directly below, and the result will still be a
check array.

$$
\begin{array}{cc}
A_1 + C & B_1 \\
A_2 + C & B_2 \\
\vdots & \vdots \\
A_n + C & B_n
\end{array} \tag{12}
$$

Note that the same is not true for the $B_i$.

By condition c') we know that the $A_i$ and $B_i$ each
include all (k - 1)-digit binary numbers, except one. The
number missing from the $B_i$ must be $I_{k-1}$, while the
number not included in the $A_i$ we shall denote by A*.
By use of the operation indicated in (12), it is possible
to set A* of a check array equal to $I_{k-1}$. We shall call such
a check array a symmetric check array.


F. Construction of the Codes


In Section III-E, we reduced the problem of constructing
systematic SEC-DAEC codes to that of constructing
symmetric check arrays. Now we shall present
a constructive proof to show that a symmetric check array
is possible for any k ≥ 3. (The k = 3 check array, however,
does not correspond to a SEC-DAEC code.)

The construction is quite simple. We let $A_1$ be an
arbitrary (nonzero) (k - 1)-digit binary number. We start a
(k - 1)-stage maximal-length linear binary shift register [4]-[7]
with $A_1$, and let $A_2, A_3, \cdots, A_n$ be the successive entries
in the shift register. Given the $A_i$'s, the $B_i$'s are determined
by b'). It is then only necessary to show that
the set of $A_i$'s and the set of $B_i$'s each include all the
(k - 1)-digit binary numbers, except $I_{k-1}$, once and only
once. That this is true for the $A_i$'s follows directly from
the definition of a maximal-length linear binary shift
register (see Section I-C). As for the $B_i$'s, we show in the
Appendix that the $B_i$'s are just a shifted version of the
$A_i$'s. That is,

$$
B_i = A_{i+s} \tag{13}
$$

for all i and some fixed s (i + s is taken mod n).

Therefore, the $B_i$'s also include each (k - 1)-digit binary
number, except $I_{k-1}$, once and only once, and the proof
that this construction leads to a symmetric check array
is complete.
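The construction can be checked numerically for k = 5. The sketch below builds the $A_i$ from a four-stage register with $S_1 = S_4 = 1$, forms the $B_i$ by condition b'), and searches for the shift s of (13). The feedback rule (the mod-2 sum of $S_i F_i$ enters $F_1$), the starting value of $A_1$, and the helper names are assumptions made for illustration.

```python
def register_states(taps, start):
    """Successive flip-flop contents A_1, A_2, ..., A_n, with A_1 = start."""
    states, state = [], tuple(start)
    for _ in range(2 ** len(taps) - 1):
        states.append(state)
        feedback = sum(s & f for s, f in zip(taps, state)) % 2
        state = (feedback,) + state[:-1]
    return states


def add(u, v):
    """Digit-by-digit mod-2 sum, as defined in (5)."""
    return tuple(a ^ b for a, b in zip(u, v))


A = register_states([1, 0, 0, 1], [0, 1, 1, 1])        # k - 1 = 4, n = 15
n = len(A)
B = [add(A[i], A[(i + 1) % n]) for i in range(n)]      # condition b')
s = A.index(B[0])                                      # candidate shift in (13)
print(s, all(B[i] == A[(i + s) % n] for i in range(n)))
```

For this register the check succeeds with a single fixed s, so the $B_i$ column is a cyclic shift of the $A_i$ column and both omit only the all-zero number.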


G. A Modification 


The previous section shows how to construct symmetric
check arrays, and thus systematic SEC-DAEC codes.
In order to construct the codes in accordance with the
simple rules of Section I-B, however, it is necessary to
modify slightly the codes we have just obtained.

This modification is accomplished in two steps.

1) Select $A_1$ as the (k - 1)-digit binary number consisting
of all ones, except for a zero in the first digit.
This step will insure that we select a symmetric
check array ending in all ones.

2) After obtaining the symmetric check array ending
in all ones, add C to every $A_i$ of this check array,
where C consists of k - 1 ones. The array obtained
from this operation will still be a check array [that
is, it satisfies a')-c')], but it will not be a symmetric
check array. For this check array, the binary
number not included in the $A_i$ column is composed
of all ones, while the binary number not included in
the $B_i$ column is composed of all zeros. We shall call
such a check array an antisymmetric check array.

Now the key property possessed by the particular
antisymmetric check array, which we construct with
steps 1 and 2, is the composition of $A_n, A_{n-1}, A_{n-2}, \cdots, A_{n-(k-2)}$.
The k - 1 digits of $A_n$ are all zeros; the first
k - 2 digits of $A_{n-1}$ are zeros; the first k - 3 digits of
$A_{n-2}$ are zeros; and so on, until we reach $A_{n-(k-2)}$, whose
first digit is a zero. We recall that $A_i$ consists of the first
k - 1 digits of the checking number which indicates that
the ith digit of the word is in error. When we construct
the parity check table (see Fig. 3) corresponding to these
$A_i$, then we find that: $y_1$ does not check any of the other
$y_i$; $y_2$ does not check $y_3, y_4, \cdots, y_{k-1}$, or $y_k$; $y_3$ does not
check $y_4, y_5, \cdots, y_{k-1}$, or $y_k$; and so on, until $y_{k-1}$, which
does not check $y_k$.

A given parity check digit of these codes then may in
general check any information digit and any preceding
parity digit.
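Steps 1 and 2 and the resulting triangular property can be verified for k = 5. This is a sketch under the same assumed feedback rule as elsewhere (the mod-2 sum of $S_i F_i$ enters $F_1$); variable names such as `symmetric` are illustrative.

```python
TAPS = (1, 0, 0, 1)      # S_1 .. S_4, the R = 4 values quoted in Section I-C
k = 5
n = 2 ** (k - 1) - 1


def shift(state):
    feedback = sum(s & f for s, f in zip(TAPS, state)) % 2
    return (feedback,) + state[:-1]


# Step 1: start the register at 0 1 1 ... 1 so that the run ends in all ones.
state = (0,) + (1,) * (k - 2)
symmetric = []
for _ in range(n):
    symmetric.append(state)
    state = shift(state)

# Step 2: adding C = 1 1 ... 1 (mod 2) complements every A_i.
A = [tuple(1 - bit for bit in row) for row in symmetric]

# Digit i of A_j is one exactly when y_i checks digit j of the word.
for i in range(1, k):                                   # rows y_1 .. y_(k-1)
    later = [A[n - k + j - 1][i - 1] for j in range(i + 1, k + 1)]
    print(f"y{i} checks a later parity digit:", any(later))
```

Every line reports that no later parity digit is checked, which is what lets the parity digits be computed one after another during encoding.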


H. The Number of Different Codes 


From Sections III-E and III-G, we see that any
maximal-length linear binary shift register corresponds
to a SEC-DAEC code. Since there does not appear to be
any reason for preferring one such register over another
with the same number of stages,⁶ we have listed in Table
III one register each, for R = 3, 4, ..., 19, where R is the
number of stages. Zierler [5] has shown that the total
number of different R-stage maximal-length linear binary
shift registers is given by

$$
\frac{\phi(2^R - 1)}{R} \tag{14}
$$

where $\phi(k)$ is known as Euler's Phi-function. For any
positive integer k, $\phi(k)$ is just the number of positive
integers less than k and prime to k, 1 included. In Table
V, we have given $\phi(2^R - 1)/R$ for R = 3, 4, ..., 19.
(This table was obtained from Marsh's tables [8].)
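Expression (14) is easy to evaluate directly. The sketch below computes Euler's phi-function by trial-division factorisation, a method chosen here for brevity and not the one used to prepare Marsh's tables; the function names are illustrative.

```python
def phi(n):
    """Euler's phi-function, computed by trial-division factorisation."""
    result, p = n, 2
    while p * p <= n:
        if n % p == 0:
            while n % p == 0:
                n //= p
            result -= result // p     # multiply result by (1 - 1/p)
        p += 1
    if n > 1:                         # one prime factor larger than sqrt remains
        result -= result // n
    return result


def register_count(R):
    """Number of different R-stage maximal-length registers, from (14)."""
    return phi(2 ** R - 1) // R


for R in range(3, 20):
    print(R, register_count(R))
```

The values printed for R = 3 through 19 can be compared line by line with Table V.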


⁶ This neglects the fact that some registers require more mod-2
adders than others. Table III gives registers which require the
minimum number of adders for R < 13.
TABLE V

THE NUMBER OF DIFFERENT R-STAGE
MAXIMAL-LENGTH LINEAR BINARY SHIFT REGISTERS

R     φ(2^R - 1)/R
3     2
4     2
5     6
6     6
7     18
8     16
9     48
10    60
11    176
12    144
13    630
14    756
15    1800
16    2048
17    7710
18    7776
19    27594

φ(2^R - 1)/R = number of different R-stage maximal-length linear
binary shift registers.


I. Another Type of SEC-DAEC Code 


The linear binary feedback shift register was introduced
as a particular method of obtaining an antisymmetric
check array (see Section III-G). It is, however, not the
only method. An antisymmetric check array which
cannot be obtained from a linear shift register⁷ is given
here.


1000 O13 050 
ie le0R0 0010 
betel. 0 VOLO RL 
Ovlaled Pel FOc0 
ISOcte: inte Ta) 
ODO 1a EN ek 
ANP eA) OF lela 
del cOge 1 0% teal (15) 
Oreo OSLeO a1 
OLOetet 1.0.10 
FORO Ab f70 2 
Os 070 Orlst-0 
0010 OO miet 
QO 0.081 0001 
0000 1000 


This array defines a systematic SEC-DAEC code. 
The instrumentation of this code, however, appears to 
involve a good deal more equipment than the instrumen- 
tation of codes derived from linear shift registers. 


J. Related Codes 


It is instructive to view the class of codes obtained in
this paper as a particular type of single-error-correcting-double-error-detecting
(SEC-DED) code. Consider an
SEC-DED code with m information digits, k parity digits,
and n = m + k total digits. We require not only that
the code detect all double errors, but also that the possible
checking numbers obtained at the output for this code shall
divide the set of n(n - 1)/2 possible double errors into n
disjoint sets. Each of these disjoint sets shall contain one,
and only one, double adjacent error (counting an error
in the first and last digits as an adjacent error). It is easy
to show that, for any k, the upper bound on m for such a
code is just m*, where m* was defined in Section III-A.
As a matter of fact, the SEC-DAEC codes obtained here
are an example of this type of SEC-DED code.

The above paragraph suggests the possibility of obtaining
single error correcting (SEC) codes derived from the
SEC-DAEC codes given here. The construction of such
SEC codes is almost trivial. We merely require that the
checking numbers for such an SEC code be the complements⁸
of the numbers in the flip-flops of a maximal-length
linear binary shift register. The SEC codes obtained
in this manner will require the same number of parity
digits as ordinary Hamming codes.


⁷ This check array was obtained by W. Michaels.


APPENDIX 
A PROPERTY OF m SEQUENCES 
We wish to show (see Section III-F) that if

$$A_1, A_2, \ldots, A_n$$

are the successive entries in the flip-flops of a maximal-length
linear binary shift register, and

$$B_i = A_i + A_{i+1}, \qquad i = 1, 2, \ldots, n-1 \tag{16}$$

$$B_n = A_n + A_1 \tag{17}$$

then

$$B_i = A_{i+s} \tag{18}$$

for $i = 1, 2, \ldots, n$ and some fixed $s$, where $(i + s)$ is taken mod $n$.


To simplify the notation, we relabel as follows:

$$A_1 = A = (a_1\ a_2\ \cdots\ a_{k-1})$$
$$A_2 = B = (b_1\ b_2\ \cdots\ b_{k-1})$$
$$A_3 = C = (c_1\ c_2\ \cdots\ c_{k-1})$$
$$B_1 = D = (d_1\ d_2\ \cdots\ d_{k-1})$$
$$B_2 = E = (e_1\ e_2\ \cdots\ e_{k-1}).$$


8 The complement of a binary number is obtained by replacing
all zeros in the number by ones and all ones by zeros.
Let the values of $S_i$ (see Fig. 2) for the register producing
$A_1, A_2, \ldots, A_n$ be $S_1, S_2, \ldots, S_{k-1}$. Then we must have

$$b_j = a_{j-1}, \qquad j = 2, 3, \ldots, k-1 \tag{19}$$

$$b_1 = \sum_{i=1}^{k-1} S_i a_i \tag{20}$$

and

$$c_j = b_{j-1}, \qquad j = 2, 3, \ldots, k-1 \tag{21}$$

$$c_1 = \sum_{i=1}^{k-1} S_i b_i. \tag{22}$$

Furthermore, we have from (16) and (17)

$$d_i = a_i + b_i \tag{23}$$

$$e_i = b_i + c_i. \tag{24}$$

Then

$$e_1 = b_1 + c_1 = \sum_{i=1}^{k-1} S_i a_i + \sum_{i=1}^{k-1} S_i b_i = \sum_{i=1}^{k-1} S_i d_i. \tag{25}$$

Furthermore, for $i = 2, 3, \ldots, k-1$,

$$e_i = b_i + c_i = a_{i-1} + b_{i-1} = d_{i-1}. \tag{26}$$


Now, since the arguments given will apply to any two
successive $B_i$, (25) and (26) show that the sequence of the
$B_i$ may be obtained from the same shift register as the
$A_i$. Finally, since this is a maximal-length shift register,
the sequence of the $B_i$ must just be a shifted version of
the sequence of the $A_i$, and (18) must hold.
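
The property can be checked numerically on a small register. The following Python sketch is only an illustration under stated assumptions: the four-stage register, its feedback taps (stages 3 and 2, which correspond to the primitive polynomial x^4 + x + 1), the seed, and the function names are hypothetical choices made here, not taken from the paper.

```python
def lfsr_states(taps=(3, 2), nbits=4, seed=1):
    """Successive flip-flop contents A_1 ... A_n of a linear binary shift register.

    Assumed (hypothetical) register: shift left, feed back the XOR of the tapped stages.
    """
    mask = (1 << nbits) - 1
    state, states = seed, []
    for _ in range(2 ** nbits - 1):
        states.append(state)
        feedback = 0
        for t in taps:
            feedback ^= (state >> t) & 1
        state = ((state << 1) | feedback) & mask
    return states


def shift_of_sums(states):
    """Return s such that A_i + A_(i+1) = A_(i+s) for every i (indices mod n), else None."""
    n = len(states)
    sums = [states[i] ^ states[(i + 1) % n] for i in range(n)]  # mod-2 sum of neighbours
    if sums[0] not in states:
        return None
    s = states.index(sums[0])
    ok = all(sums[i] == states[(i + s) % n] for i in range(n))
    return s if ok else None


states = lfsr_states()
s = shift_of_sums(states)
print(len(set(states)), s)
```

For this particular register all 15 nonzero states occur, so it is maximal-length, and the sums of successive states reproduce the same cycle displaced by a fixed number of steps, as (18) asserts.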
A Probabilistic Model for Run-Length Coding of Pictures*


JACK CAPON†


Summary—A first-order Markoff process representation for 
pictures is proposed in order to study the picture coding system 
known as run-length coding (differential-coordinate encoding). A 
lower bound for the saving in channel capacity is calculated on 
the basis of this model, and is compared with the results obtained 
by previous investigators. In addition, this representation is shown 
to yield an insight into the run-length coding system which might 
not otherwise be obtained. The application of this probabilistic 
model to an “elastic” system of run-length coding is also discussed. 
work was done when the author was employed by the Ford Instrument
Company, Long Island City, N. Y., during the summer of
1958. 

† Federal Scientific Corporation, New York, N. Y. Formerly
with the Dept. Elec. Engrg., Columbia University, New York, N. Y. 


INTRODUCTION 


It has been recognized that a saving in either time
or bandwidth is possible by exploiting the large-scale
redundancies in pictures. One of the first
proposals advanced for taking advantage of these redundancies
is the variable sweep system of Cherry and
Gouriet.¹ The authors found that theoretically a saving
of approximately seven in either time or bandwidth is
possible with their system.


1 E. C. Cherry and G. G. Gouriet, "Some possibilities for the
compression of television by recoding," Proc. IEE, vol. 100, pt. III,
pp. 9-18; January, 1953.
Another scheme which has been proposed is the run-length
coding, or differential-coordinate encoding system.²
Although this system is applicable to half-tone pictures,³
we shall, for simplicity, consider its implementation only
for black and white pictures. The basic idea in this scheme
is to transmit only the lengths of the black and white runs
in a picture as they occur in successive scanning lines.
The first implementation of this system appears to be
that of Treuhaft.⁴ However, he gives no results concerning
the possible reduction in either time or bandwidth
for a run-length coding type system. Deutsch⁵ has
measured the probability distribution of runs in a section
of typewritten material, and concluded that a saving of
approximately two is possible in a run-length coding
system which uses an optimum code. More extensive
studies of the run-length probability distributions for
typewritten material have been made by Michel,⁶ who
concluded from his findings that theoretically a saving
of approximately ten is possible with a run-length coding
system which employs an optimum code.

These results⁵,⁶ are sufficiently encouraging to warrant
further investigation of the run-length coding system. The 
present work stems from a desire to predict more generally 
the saving that is possible with a run-length coding 
system for various types of black and white pictures, 
and to gain an insight into this system. The analysis is 
carried out only for black and white pictures, and for a 
binary digital transmission channel. 


THE DESCRIPTION OF A PICTURE IN TERMS OF A
FIRST-ORDER MARKOFF PROCESS


The process of scanning reduces a picture from a two- 
dimensional array of cells (resolution elements) to a one- 
dimensional sequence of cells. In the case of a black and 
white picture such a sequence would consist of a succession 
of black and white cells. A section of this sequence might 
appear as below. 


... BBBWWBBBBBBWWWWBWBBWBWW ...


Thus, in our subsequent discussion, when we use the 
word "picture" we shall, in reality, be referring to a
one-dimensional sequence of cells which results from scanning
a picture. (Note: Since there are many ways to scan a 
picture, this representation is not unique. However, for 
our purposes, this is unimportant.) 


2 A. E. Laemmel, "Coding Processes for Bandwidth Reduction
in Picture Transmission," Microwave Res. Inst., Polytechnic Inst.
of Brooklyn, Brooklyn, N. Y., Rept. No. R 246-51; August, 1951.

3 W. F. Schreiber and C. F. Knapp, "TV bandwidth reduction
by digital coding," 1958 IRE NATIONAL CONVENTION RECORD,
pt. 4, pp. 88-99.

4 M. A. Treuhaft, "Description of a System for Transmission of
Line Drawings with Bandwidth-Time Compression," Microwave
Res. Inst., Polytechnic Inst. of Brooklyn, Brooklyn, N. Y., Rept.
No. R 339-53; September 4, 1953.

5 S. Deutsch, "Some Statistics Concerning Typewritten or
Printed Material," Microwave Res. Inst., Polytechnic Inst. of
Brooklyn, Brooklyn, N. Y., Rept. No. R-526; October 31, 1956.

6 W. S. Michel, "Statistical encoding for text and picture
communication," Commun. and Electronics, vol. 35, pp. 33-86; March,
1958.
The assumption, on which the analysis presented
herein is based, can now be stated. 

The transition in intensity from a given cell to the im- 
mediately succeeding cell is determined solely by the 
intensity of the given cell. 

This dependence is a probabilistic one which is given
in terms of the transition probabilities $p_w(w)$, $p_w(b)$,
$p_b(b)$, and $p_b(w)$, where⁷

$p_w(w)$ is the probability that a cell is white, given that
the immediately preceding cell is white;

$p_w(b)$ is the probability that a cell is black, given that
the immediately preceding cell is white
$= 1 - p_w(w)$;

$p_b(b)$ is the probability that a cell is black, given that
the immediately preceding cell is black;

$p_b(w)$ is the probability that a cell is white, given that
the immediately preceding cell is black
$= 1 - p_b(b)$.

Two other parameters which enter into the analysis
are $p(w)$ and $p(b)$, where⁸

$p(w)$ is the probability of obtaining a white cell;

$p(b)$ is the probability of obtaining a black cell
$= 1 - p(w)$.


The transition probabilities can be related to each
other, by noting first that

$$p(w; b) = p(b; w)$$

where

$p(w; b)$ = the joint probability of obtaining a white
cell followed by a black cell;

$p(b; w)$ = the joint probability of obtaining a black
cell followed by a white cell.

Hence, since $p(w; b) = p(w)p_w(b)$, and $p(b; w) = p(b)p_b(w)$,
we obtain

$$p(w)p_w(b) = p(b)p_b(w).$$

It is easily seen that only two of the six probabilities
$p(w)$, $p(b)$, $p_w(w)$, $p_w(b)$, $p_b(b)$, $p_b(w)$ are independent;
for example, an independent set comprises $p(w)$ [or
$p(b)$], and any one of the four transition probabilities.
At this point we note that the assumption is equivalent
to stating that a picture can be represented by a first-order
Markoff process with stationary transition probabilities.⁹
This model for an information source has been
proposed previously by Oliver,¹⁰ among others.
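
As an illustration of how these parameters could be obtained from a scanned picture, the following Python sketch counts cells and adjacent cell pairs in a sequence of the kind shown earlier. It is a minimal sketch, not a procedure given in the paper: the function name, the 'W'/'B' string encoding, and the use of simple relative frequencies as estimates are assumptions, and the sequence is assumed to contain both intensities.

```python
from collections import Counter


def estimate_probabilities(cells):
    """Estimate p(w) and the four transition probabilities by relative frequency.

    `cells` is a string of 'W' and 'B' obtained by scanning (hypothetical encoding).
    """
    pair_counts = Counter(zip(cells, cells[1:]))   # counts of (preceding cell, cell)
    preceding_counts = Counter(cells[:-1])         # every cell that has a successor
    p_white = cells.count("W") / len(cells)
    transitions = {
        (prev, nxt): pair_counts[(prev, nxt)] / preceding_counts[prev]
        for prev in "WB"
        for nxt in "WB"
    }
    return p_white, transitions, pair_counts


cells = "BBBWWBBBBBBWWWWBWBBWBWW"
p_white, transitions, pair_counts = estimate_probabilities(cells)
print(p_white, transitions)
```

In any finite sequence the number of white-to-black pairs and the number of black-to-white pairs can differ by at most one, which is the counting fact behind $p(w; b) = p(b; w)$ for long sequences.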


7 It is well to emphasize that the transition probabilities are not
unique, since they depend on the particular scanning direction
which is used. However, for a given scanning direction the
transition probabilities are unique.

8 p(w) and p(b) are unique for a particular picture, since they
do not depend on the scanning direction.

9 W. Feller, "An Introduction to Probability Theory and Its
Applications," John Wiley and Sons, Inc., New York, N. Y., vol.
1, 2nd ed.; 1957.

10 B. M. Oliver, "Efficient coding," Bell Sys. Tech. J., vol. 31,
pp. 724-750; July, 1952.


Since a transition from a given intensity to any other
intensity is possible in a sufficiently large number of
steps, it follows that our Markoff process representation
for pictures is transitive, and thus ergodic. This fact will
be of importance to us in our subsequent work.

It should be mentioned that our model does not take
into account all of the dependence which exists among the
various cells in the picture. Hence we can only expect
that our results concerning information rates and savings
in channel capacity are first-order approximations to the
true values.

In particular, our results for the saving in channel
capacity will at best be crude lower bounds. However,
it should be borne in mind that a model which takes into
account more of the dependences among the cells in the
picture than ours will not be as amenable to analysis as
our model.

Elias¹¹ has calculated the possible saving in channel
capacity of a run-length coding system on the basis of a
model in which successive cells in the picture are
considered to be independent.


NUMBER OF BITS REQUIRED TO CODE A PICTURE BY
MEANS OF A NONSTATISTICAL CODING SYSTEM


The present-day picture transmission method which
specifies the intensity of each cell in a picture
independently of the intensity of any other cell will be
termed a nonstatistical coding system. This system does
not exploit the redundancies which exist in a picture; as
a consequence, the system is inefficient in the sense that
it requires a larger number of bits to specify the picture
than is actually necessary.

We now proceed to calculate the number of bits
required by a nonstatistical coding system to specify a
picture. Since the nonstatistical coding system specifies
the intensity of each cell in a picture independently of all
the other cells, it is clear that one bit is required for each
cell in the picture. If we let $N$ represent the total number
of cells in the picture, then it is apparent that the number
of bits required is equal to $N$.

It is advantageous to pause at this point to investigate
the number $N$. If we consider that an average picture is
8½ by 11 inches and that the resolution is 100 lines per
inch in the vertical and horizontal directions, then $N$ will
be $(8.5)(11)(100)^2$, or approximately one million. Hence,
we observe that $N$ is very large, and indeed for our purposes
may be considered to be infinite. We shall make use of
this fact in our subsequent work by investigating the
behavior of coding systems as $N$ approaches infinity.
Since $N$ is very large, such a limiting behavior will tend
to specify the true situation quite closely. In addition,
taking limits as $N$ approaches infinity enables us to
simplify considerably the expressions which will be
encountered.


11 P. Elias, "Predictive coding," IRE TRANS. ON INFORMATION
THEORY, vol. IT-1, pp. 16-33; March, 1955.
MINIMUM NUMBER OF BITS REQUIRED TO CODE THE
MARKOFF SOURCE REPRESENTATION OF A PICTURE


We now consider the minimum possible number of
bits, $N_{\min}$, required by a statistical coding system to code
the Markoff source representation of a picture. It is well
known¹²,¹³ that if the Markoff process is ergodic, and is
described by the transition probabilities introduced
previously, then the statistical coding system which
achieves this minimum is one which codes the intensity
of each cell on the basis of a knowledge of the intensity
of the immediately preceding cell. If a binary alphabet
is used, then it can be shown that $N_{\min}/N$ is equal to
the conditional information content (conditional entropy)
of the Markoff process; i.e.,¹⁴


$$N_{\min}/N = -p(b)p_b(b)\log p_b(b) - p(b)p_b(w)\log p_b(w) - p(w)p_w(w)\log p_w(w) - p(w)p_w(b)\log p_w(b). \tag{1}$$

We emphasize again that any coding system, whether
statistical or nonstatistical, cannot code a picture with
fewer bits than $N_{\min}$. However, it is possible
for a coding system to achieve the lower bound $N_{\min}$. In
fact, as we shall see subsequently, the run-length coding
system does achieve this lower bound as $N$ approaches
infinity.
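
To make (1) concrete, the following Python sketch evaluates $N_{\min}/N$ from the two independent parameters $p(w)$ and $p_w(w)$, deriving the remaining transition probabilities from the relation $p(w)p_w(b) = p(b)p_b(w)$. The function name and argument names are hypothetical, $p(w)$ is assumed to lie strictly between 0 and 1, and terms with zero probability are taken to contribute nothing.

```python
import math


def min_bits_per_cell(p_white, p_ww):
    """Evaluate the four-term sum (1), i.e. N_min / N, in bits per cell."""
    p_black = 1.0 - p_white
    p_wb = 1.0 - p_ww
    p_bw = p_white * p_wb / p_black      # from p(w) p_w(b) = p(b) p_b(w)
    if not 0.0 <= p_bw <= 1.0:
        raise ValueError("p(w) and p_w(w) imply an impossible p_b(w)")
    p_bb = 1.0 - p_bw
    terms = [(p_black, p_bb), (p_black, p_bw), (p_white, p_ww), (p_white, p_wb)]
    # Each term is -(cell probability)(transition probability) log2(transition probability).
    return -sum(weight * q * math.log2(q) for weight, q in terms if q > 0.0)


print(min_bits_per_cell(0.5, 0.5))   # independent, equiprobable cells
print(min_bits_per_cell(0.5, 0.9))   # long runs need fewer bits per cell
```

The first call corresponds to a random pattern, for which one bit per cell is needed; the second shows how strong dependence between neighbouring cells lowers the bound.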


NUMBER OF BITS REQUIRED BY THE RUN-LENGTH CODING
SYSTEM TO CODE THE MARKOFF PROCESS
REPRESENTATION OF PICTURES


Unlike the statistical coding scheme described above, 
which codes individual cells, the run-length coding system 
codes groups of cells.¹⁵ Each group consists of an all-white
or all-black linear array of cells. Such arrays or “runs,” 
are readily found along the customary scanning lines. 
The run-length coding system counts the number of cells 
in each run and suitably encodes this number for trans- 
mission. If a binary alphabet is used in the coding process, 
then it can be shown” that the minimum number of bits 
required to code the run-lengths is given by the informa- 
tion content of the probability distribution of the run- 
lengths. Thus, in order to calculate the minimum number 
of bits required by the run-length coding system, we 
shall digress and find the probability distribution of the 
run-lengths. It should be pointed out that no specific 
codes are proposed (e.g., Shannon-Fano-Huffman codes), 
but that only the minimum number of bits required to 
code an information source is computed. 


12 C. E. Shannon, "A mathematical theory of communication,"
Bell Sys. Tech. J., vol. 27, pp. 379-423, July, 1948; pp. 623-656,
October, 1948.

13 A. I. Khinchin, "Mathematical Foundations of Information
Theory," Dover Publications, Inc., New York, N. Y., pp. 13-28;
1957.

14 All logarithms are to the base two.

15 W. S. Michel, W. O. Fleckenstein, and E. R. Kretzmer, "A
coded facsimile system," 1957 WESCON CONVENTION RECORD,
pt. 2, pp. 84-93.




A white run of length $x$ ($x = 1, 2, \ldots$) is defined as a
set of x consecutive white cells preceded by a black cell 
and succeeded by a black cell. A black run of length x is 
defined analogously. 

Let $R_w$ and $R_b$ be the random variables which represent,
respectively, the length of a white and a black run. In
view of the assumption we have made, we see that as $N$
approaches infinity, the probability of obtaining a white
run of length equal to $x$ is the same as the probability
that $(x - 1)$ transitions from white to white have taken
place, and that this is followed by a transition from white
to black. Thus¹⁶

$$\operatorname{prob}(R_w = x) = p_w(w)^{x-1}p_w(b), \qquad x = 1, 2, \ldots \quad (N \to \infty). \tag{2}$$

Similarly, it is obtained that the probability of obtaining
a black run of length $x$ is

$$\operatorname{prob}(R_b = x) = p_b(b)^{x-1}p_b(w), \qquad x = 1, 2, \ldots \quad (N \to \infty). \tag{3}$$


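If we let $H_w$ denote the information content of the
white run-length probability distribution, then

$$H_w = -\sum_{x=1}^{\infty} \operatorname{prob}(R_w = x)\log \operatorname{prob}(R_w = x)$$
$$\phantom{H_w} = -\log\{[1 - p_w(w)]/p_w(w)\} - [1 - p_w(w)]^{-1}\log p_w(w) \tag{4}$$

where (4) is obtained by making use of the summation
formulas

$$\sum_{x=0}^{\infty} a^x = (1 - a)^{-1}, \qquad |a| < 1 \tag{5}$$

$$\sum_{x=0}^{\infty} x a^x = a(1 - a)^{-2}, \qquad |a| < 1. \tag{6}$$

Similarly, if we let $H_b$ denote the information content of
the black run-length probability distribution, we obtain

$$H_b = -\sum_{x=1}^{\infty} \operatorname{prob}(R_b = x)\log \operatorname{prob}(R_b = x)$$
$$\phantom{H_b} = -\log\{[1 - p_b(b)]/p_b(b)\} - [1 - p_b(b)]^{-1}\log p_b(b). \tag{7}$$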
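
The closed form for the information content of a run-length distribution can be checked against a direct summation over the geometric probabilities of (2). The Python sketch below is illustrative only; the function names, the truncation of the infinite sum, and the sample value 0.9 for the probability of remaining in the same intensity are assumptions made here.

```python
import math


def run_entropy_closed_form(p_stay):
    """Closed form of (4)/(7): entropy in bits of a geometric run-length distribution."""
    return -math.log2((1.0 - p_stay) / p_stay) - math.log2(p_stay) / (1.0 - p_stay)


def run_entropy_by_summation(p_stay, max_length=100_000):
    """Direct evaluation of -sum prob(R = x) log2 prob(R = x), truncated numerically."""
    total = 0.0
    for x in range(1, max_length + 1):
        prob = p_stay ** (x - 1) * (1.0 - p_stay)   # (x - 1) stays, then one change
        if prob == 0.0:                              # remaining terms underflow to zero
            break
        total -= prob * math.log2(prob)
    return total


closed = run_entropy_closed_form(0.9)
summed = run_entropy_by_summation(0.9)
print(closed, summed)
```

The two values should agree to within rounding error, which is a useful sanity check on the signs and exponents in (4).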


At this point we note that one of the consequences of
the assumption that $N$ approaches infinity is that the
run-lengths are independent. If $N$ is finite then any set
of $j$ run-lengths must satisfy the inequality that the sum
of their lengths be less than or equal to $N$; hence these
run-lengths are dependent. If we allow $N$ to approach
infinity then any subset of $k$ ($k < j$) run-lengths will
satisfy the inequality independently of the other $j - k$
run-lengths; i.e., the set of $k$ run-lengths is independent
of the set of $j - k$ run-lengths. Since $j$ and $k$ are arbitrary
it follows that all of the run-lengths will be independent


16 The probability distribution of $R_w$, as well as that of $R_b$, is
known as a geometric distribution.
as $N$ approaches infinity. This fact will be used in our
subsequent discussion.

Since all the black and white runs which constitute
the picture have the same probability distribution, it is
obvious that the total number of bits required by the
run-length coding system is obtained by summing the
information contents $H_w$ and $H_b$ over the entire picture.
We denote the total number of runs by $n$; since $n$ is very
large we can assume, without loss of generality, that $n$
is even so that there are $n/2$ white and $n/2$ black runs
in the picture. Hence the total number of bits $N_R$ required
to specify the Markoff process representation of the
picture by run-length coding is


$$N_R = (n/2)(H_w + H_b)$$
$$\phantom{N_R} = (n/2)\big(-\log\{[1 - p_w(w)]/p_w(w)\} - [1 - p_w(w)]^{-1}\log p_w(w) - \log\{[1 - p_b(b)]/p_b(b)\} - [1 - p_b(b)]^{-1}\log p_b(b)\big). \tag{8}$$


We are now in a position to prove the following theorem.

Theorem: In the limit, as $N$ approaches infinity, the
number of bits $N_R$ required to code the black and white
runs in a first-order Markoff chain is equal to the
theoretical lower bound $N_{\min}$.

Proof: From (1) and (8) we obtain

$$\lim_{N\to\infty} N_R/N_{\min} = \lim_{N\to\infty} (n/2N)\big([1 - p_w(w)]^{-1} + [1 - p_b(b)]^{-1}\big). \tag{9}$$

We now digress to evaluate $\lim_{N\to\infty}(n/2N)$. In
order to do this we note that the sum of the lengths of
the white and black runs is equal to the number
of cells in the picture. That is,

$$\sum_{i=1}^{n/2} (R_w^i + R_b^i) = N \tag{10}$$

where $R_w^i$ and $R_b^i$ are defined as the random variables
which represent, respectively, the length of the $i$th white
and $i$th black runs.

Eq. (10) can be rewritten as follows:

$$(2/n)\sum_{i=1}^{n/2} (R_w^i + R_b^i) = 2N/n. \tag{11}$$

However, $(2/n)\sum_{i=1}^{n/2} R_w^i$ is a sum of independent and
identically distributed random variables; hence, by
the Kolmogoroff Strong Law of Large Numbers¹⁷
(as $N \to \infty$), this sum converges with probability one to
the expected value of the white run lengths, $E(R_w)$ [if
$E(R_w) < \infty$]. Similarly, $(2/n)\sum_{i=1}^{n/2} R_b^i$ converges to
$E(R_b)$ [if $E(R_b) < \infty$], with probability one, as $N$
approaches infinity. Thus,

$$\lim_{N\to\infty} (2N/n) = E(R_w) + E(R_b) \tag{12}$$


17 M. Loève, "Probability Theory," D. Van Nostrand Co., Inc.,
Princeton, N. J.


which, by (2) and (3) and the summation formula in
(6), becomes

$$\lim_{N\to\infty} (2N/n) = [1 - p_w(w)]^{-1} + [1 - p_b(b)]^{-1}. \tag{13}$$

Substituting (13) into (9), we obtain

$$\lim_{N\to\infty} N_R/N_{\min} = 1$$

which proves the theorem.
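
The theorem can also be checked numerically by evaluating both sides in their limiting per-cell form. The Python sketch below is an illustration under stated assumptions: the function names are hypothetical, the limit of $n/2N$ is taken from (12) and (13), and $p(w)$ is computed as the long-run fraction of white cells, $E(R_w)/[E(R_w) + E(R_b)]$.

```python
import math


def run_entropy(p_stay):
    """Closed form (4)/(7) for the information content of a run-length distribution."""
    return -math.log2((1.0 - p_stay) / p_stay) - math.log2(p_stay) / (1.0 - p_stay)


def ratio_run_length_to_minimum(p_ww, p_bb):
    """Limiting value of N_R / N_min for given p_w(w) and p_b(b), both in (0, 1)."""
    mean_white = 1.0 / (1.0 - p_ww)                 # E(R_w), geometric mean length
    mean_black = 1.0 / (1.0 - p_bb)                 # E(R_b)
    run_pairs_per_cell = 1.0 / (mean_white + mean_black)   # limit of n / (2N)
    bits_run_length = run_pairs_per_cell * (run_entropy(p_ww) + run_entropy(p_bb))

    p_white = mean_white / (mean_white + mean_black)
    p_black = 1.0 - p_white
    bits_minimum = -(
        p_black * p_bb * math.log2(p_bb)
        + p_black * (1.0 - p_bb) * math.log2(1.0 - p_bb)
        + p_white * p_ww * math.log2(p_ww)
        + p_white * (1.0 - p_ww) * math.log2(1.0 - p_ww)
    )                                               # right-hand side of (1)
    return bits_run_length / bits_minimum


for p_ww, p_bb in [(0.9, 0.6), (0.5, 0.5), (0.99, 0.2)]:
    print(p_ww, p_bb, ratio_run_length_to_minimum(p_ww, p_bb))
```

Each ratio should come out equal to one up to rounding error, whatever admissible pair of transition probabilities is chosen.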

The result stated by our theorem is not surprising,
since both $N_R$ and $N_{\min}$ represent the number of bits
required to specify $N$ cells in the first-order Markoff
chain. The quantity $N_{\min}$ is calculated on the basis of
individual cells and $N_R$ is calculated on the basis of groups
of cells which become independent as $N$ approaches
infinity. Thus $N_R$ and $N_{\min}$ must be the same as $N$
approaches infinity. The fact that the groups of cells are
independent is important; this will be seen to be true
when we consider a run-length coding system which
uses the same code for black and white runs of the same
length.


SAVING IN CHANNEL CAPACITY 


We now turn our attention to the calculation of the
saving in channel capacity $S$ of the run-length coding
system with respect to the nonstatistical coding scheme.
It is apparent that $S$ is given by

$$S = N/N_R. \tag{14}$$

In view of (1) and the theorem which we have just
proved, (14) becomes

$$S = N/N_R = N/N_{\min} = \big(-p(b)p_b(b)\log p_b(b) - p(b)p_b(w)\log p_b(w) - p(w)p_w(w)\log p_w(w) - p(w)p_w(b)\log p_w(b)\big)^{-1}. \tag{15}$$

Since $N_{\min}$ is less than or equal to $N$ for all values of
$p(w)$ and of transition probabilities, it follows that $S$
is always greater than or equal to unity, for all types of
pictures. Thus, our probabilistic model predicts that
the saving in channel capacity is always greater than
or equal to unity for all possible pictures. For a given
transmission facility the saving in channel capacity
becomes a saving in time. For a given transmission time,
the saving in channel capacity becomes a saving in
bandwidth.

A graph of the saving is shown in Figs. 1(a) and 1(b),
where for the sake of convenience the independent
variables are chosen to be $p(w)$ and $p_w(w)$. It is easily
found that for a fixed $p(w)$, the minimum value of $S$,
denoted by $S_{\min}(p(w))$, occurs when $p_w(w) = p(w)$, and
is given by

$$S_{\min}(p(w)) = \big(-p(w)\log p(w) - p(b)\log p(b)\big)^{-1}. \tag{16}$$

The minimum possible value of $S_{\min}(p(w))$ occurs when
$p(w) = 1/2$, and is equal to one. Thus, the saving is one for
pictures which contain equal black and white areas, and
have a transition probability $p_w(w)$ of 1/2. In such cases,
Fig. 1—(a) Saving in channel capacity vs transition probability
$p_w(w)$, for various values of $p(w)$ between 0 and 1/2. (b) Saving in
channel capacity vs transition probability $p_w(w)$, for various
values of $p(w)$ between 1/2 and 1.


there would be no point in using a statistical coding 
system, and we would no doubt use a nonstatistical coding 
system. However, such pictures very rarely occur in 
practice. In fact, such a picture would be the equivalent 
of a random pattern, in which there is no dependence 
among the various cells. Hence our probabilistic model 
predicts that for pictures which are equivalent to random 
patterns there is no point in using a run-length coding 
system, since there is very little to be gained. 
We turn our attention now to the question of when there
are large savings. From Fig. 1(a) and 1(b) we observe 
that large savings are possible for the following three 
cases: 

1) If $p_w(w)$ is close to unity, then the saving will be
large, regardless of the value of $p(w)$. This corresponds
to the situation where the white runs in the picture are
very long. In this case a very large saving can be
expected.

2) If $p(w)$ is close to zero, or one, then the saving is
again very large, regardless of the value of $p_w(w)$. This
situation corresponds to a picture which is almost
completely black, or completely white, and again it is to be
expected that a large saving is to be obtained by coding
runs rather than using a nonstatistical transmission
method.

3) If $p_w(w)$ is close to zero, and $p(w)$ is close to
one-half, then the saving is again very large. The situation
is that corresponding to a picture which is a checkerboard
pattern. Thus, since it is known that all black and white
runs in the picture will be of unit length, the transmitter
need send no bits to specify the picture (except possibly
for one bit to specify whether the first cell is black or
white).

We observe that for given values of $p(w)$ the curves
of Fig. 1(b) do not extend beyond certain critical values
of $p_w(w)$. This is so, since values of $p_w(w)$ less than these
critical values lead to impossible values for the other
transition probabilities.


COMPARISON WITH DEUTSCH'S RESULTS


It is of interest to compare the saving in channel 
capacity predicted by the probabilistic model with the 
value of saving found by previous investigators who 
have obtained that saving by actually measuring the 
probability distribution of the run-lengths in a picture. 
Towards this end, we compare our results with those of 
Deutsch,⁵ who has measured the run-length probability
distribution for the two cases of horizontally and vertically
scanning a picture. From his results we obtain that

$$p(w) = 4107/5025,$$

and that the transition probability $p_w(w)$ in the vertical
direction is

$$p_w(w) = 3702/4107 \quad \text{(vertical direction)}.$$

Substituting these values in (15) we obtain

$$S = 1.78 \quad \text{(vertical direction)}.$$

The corresponding value obtained by Deutsch is 2.02.
The transition probability in the horizontal direction is

$$p_w(w) = 3640/4107 \quad \text{(horizontal direction)}.$$

The value of $p(w)$ is of course the same for both the vertical
and horizontal directions. Substituting these values in
(15), we obtain

$$S = 1.67 \quad \text{(horizontal direction)}.$$
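
The arithmetic of this comparison can be reproduced directly from (15). The following Python sketch is a worked check rather than part of the original analysis; the function and variable names are hypothetical, and the only inputs are the ratios quoted from Deutsch's measurements.

```python
import math


def saving(p_white, p_ww):
    """Saving in channel capacity S from (15), given p(w) and p_w(w)."""
    p_black = 1.0 - p_white
    p_wb = 1.0 - p_ww
    p_bw = p_white * p_wb / p_black      # from p(w) p_w(b) = p(b) p_b(w)
    p_bb = 1.0 - p_bw
    bits_per_cell = -(
        p_black * p_bb * math.log2(p_bb)
        + p_black * p_bw * math.log2(p_bw)
        + p_white * p_ww * math.log2(p_ww)
        + p_white * p_wb * math.log2(p_wb)
    )
    return 1.0 / bits_per_cell


p_white = 4107 / 5025
s_vertical = saving(p_white, 3702 / 4107)     # about 1.78
s_horizontal = saving(p_white, 3640 / 4107)   # about 1.67
print(round(s_vertical, 2), round(s_horizontal, 2))
```

Relative to Deutsch's measured savings of 2.02 and 2.14, these values fall short by roughly 12 and 22 per cent respectively.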
The corresponding value found by Deutsch is 2.14. Thus
the saving in channel capacity predicted by our model is
in error by approximately 12 per cent in the vertical
direction and about 22 per cent in the horizontal
direction. In both cases the probabilistic model predicts a
saving which is less than the true value. In view of the
rather close agreement found for the value of the saving
we would feel that the probabilistic model is a good one
for predicting the amount of saving possible for the
run-length coding system. However, more results of this
kind would certainly have to be obtained before any
general statement to this effect could be made.


THE USE OF THE SAME CODE FOR BLACK AND WHITE
RUNS OF THE SAME LENGTH VS DIFFERENT CODES
FOR BLACK AND WHITE RUNS OF THE SAME LENGTH


It is certainly simpler to implement a run-length coding
system which uses the same code for black and white runs
of the same length than one which uses a different code
for black and white runs of the same length. However, we
have seen that our model predicts that this latter method
achieves the lower bound of (1), as $N$ approaches infinity,
and thus offers a greater saving than the former method.
We now investigate how much more of a saving is possible
with this latter method than with the former. In order to
do this we must compute the information content of the
probability distribution of a run-length which can be
either white or black.

Let $R$ denote the random variable which represents the
length of either black or white runs. Then, by reasoning
which is similar to that used in obtaining (2) we find

$$\operatorname{prob}(R = x) = p(w)p_w(w)^{x-1}p_w(b) + p(b)p_b(b)^{x-1}p_b(w), \qquad x = 1, 2, \ldots \quad (N \to \infty). \tag{17}$$

Denoting the information content of this probability
distribution by $H_S$, we find that

$$H_S = \sum_{x=1}^{\infty} \big(-p(w)p_w(w)^{x-1}p_w(b) - p(b)p_b(b)^{x-1}p_b(w)\big) \log\big(p(w)p_w(w)^{x-1}p_w(b) + p(b)p_b(b)^{x-1}p_b(w)\big). \tag{18}$$

If the number of bits required to code the picture by
this method is denoted by $N_S$, then

$$N_S = nH_S. \tag{19}$$


It is easily seen that if $p(w) = p(b) = 1/2$, then $N_S = N_{\min}$.
Thus this method of coding achieves the lower bound of
(1) for pictures which contain equal black and white
areas. Hence for pictures which contain roughly the same
number of black and white cells there is no advantage
to be gained in coding black and white runs of the same
length with different codes, instead of coding them with
the same code.

If $p(w)$ [or $p(b)$] is close to zero, then it is easily seen
that $N_S$ is approximately twice as large as $N_{\min}$. Thus in
this case there is an advantage to be gained in using a
different code for black and white runs of the same
length, rather than using the same code.

A straightforward calculation shows that run-lengths
in this coding scheme are no longer independent, except
when $p(w) = p(b) = 1/2$. This accounts for the fact that
$N_S$ and $N_{\min}$ are different for those pictures which do not
have the same number of black and white cells.
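
The comparison between the two methods can be explored numerically by summing (18) term by term. The Python sketch below is illustrative only: the function names, the truncation of the infinite sum, and the sample parameter values are assumptions, and the number of runs per cell, $n/N$, is taken from the limit in (13).

```python
import math


def transition_probabilities(p_white, p_ww):
    """Return p(b), p_w(b), p_b(w), p_b(b) implied by p(w) and p_w(w)."""
    p_black = 1.0 - p_white
    p_wb = 1.0 - p_ww
    p_bw = p_white * p_wb / p_black      # from p(w) p_w(b) = p(b) p_b(w)
    return p_black, p_wb, p_bw, 1.0 - p_bw


def same_code_bits_per_cell(p_white, p_ww, max_length=200_000):
    """N_S / N = (n / N) H_S, with H_S obtained by summing (18) numerically."""
    p_black, p_wb, p_bw, p_bb = transition_probabilities(p_white, p_ww)
    h_s = 0.0
    for x in range(1, max_length + 1):
        prob = p_white * p_ww ** (x - 1) * p_wb + p_black * p_bb ** (x - 1) * p_bw
        if prob == 0.0:                  # remaining terms underflow to zero
            break
        h_s -= prob * math.log2(prob)
    runs_per_cell = 2.0 / (1.0 / p_wb + 1.0 / p_bw)   # limit of n / N from (13)
    return runs_per_cell * h_s


def min_bits_per_cell(p_white, p_ww):
    """N_min / N from (1)."""
    p_black, p_wb, p_bw, p_bb = transition_probabilities(p_white, p_ww)
    pairs = [(p_black, p_bb), (p_black, p_bw), (p_white, p_ww), (p_white, p_wb)]
    return -sum(weight * q * math.log2(q) for weight, q in pairs if q > 0.0)


for p_white, p_ww in [(0.5, 0.8), (0.1, 0.5)]:
    print(p_white, p_ww, same_code_bits_per_cell(p_white, p_ww), min_bits_per_cell(p_white, p_ww))
```

With equal black and white areas the two figures coincide, while for a predominantly black picture the single shared code requires noticeably more bits per cell than the bound of (1).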

EXPONENTIAL CHARACTER OF THE RUN-LENGTH
PROBABILITY DISTRIBUTION


The probability distribution of the black and white
runs [(2) and (3)] as predicted by our probabilistic model
for pictures has an exponential character, as shown in
Fig. 2. This result also bears some resemblance to the
results for the probability distributions measured by
other investigators. Michel⁶ finds a probability
distribution of runs which has a very strong exponential trend.
Some of the results obtained by Deutsch⁵ also indicate
a strong exponential trend. However, some of his results
do not indicate such a trend. In particular, the probability
distribution that he finds for the white runs in the vertical
direction does not indicate an exponential trend, and in
fact exhibits no trends.


APPLICATION OF THE PROBABILISTIC MODEL TO AN
"ELASTIC" SYSTEM OF RUN-LENGTH CODING


It will be recalled that the saving in channel capacity
is calculated for a particular probability distribution of
run-lengths. Thus, the saving pertains only to that set of
pictures which has the aforementioned probability
distribution. If a picture with a different probability
distribution is coded by the run-length coding system, then
it is possible that there will be no saving in channel
capacity. That is, if the code is not "matched" to the
probability distribution of run-lengths in a particular
picture, there may not be a saving in channel capacity.
This is one of the disadvantages of the run-length coding
system. One method of solving this problem would be to
measure the run-length probability distribution of the
picture before the picture is coded, in order to match the
code to this distribution. Such a system, which changes
its code for different pictures, is known as an "elastic"
system. In practice, an elastic system which measures the
run-length probability distributions directly is difficult
Fig. 2—Exponential character of the probability distribution of
run-lengths.


to implement, since the measurement of run-length 
probability distributions is difficult to make. 

As an alternative to the above method there is the 
possibility of computing the probability distribution of 
run-lengths on the basis of the probabilistic model de- 
scribed previously. The outstanding feature of this model 
is that it is described by parameters which are very 
simple to measure in practice. The transition probabilities 
which describe the model can be measured very easily
by means of the optical correlator used by Kretzmer
and other investigators in their studies concerning the
measurements of the statistics of pictures. Once these
transition probabilities have been measured, the elastic
run-length coding system could match its code to the
run-length probability distributions which are calculated
from them. In this manner, the elastic run-length coding
system could achieve a greater saving in channel capacity, 
on the average, than would be otherwise possible. 
A Note on Invariant Relations for Ambiguity and
Distance Functions*


C. A. STUTT†


Summary—Woodward's result for the ambiguity function, that
the volume associated with its squared magnitude over the time- 
shift and frequency-shift plane is a constant, has been shown to 
be true also for a cross-ambiguity function for two time functions. 
If complex time functions have been obtained by means of a Hilbert 
transformation from real time functions, it is found for the cross- 
ambiguity function that the volumes under the squared real part 
and under the squared imaginary part are constant and contribute 
equally to the volume under the squared magnitude function. A 
“distance” function for two time functions is defined to be the 
integrated squared difference between these functions. The rela- 
tion for the squared real part of the ambiguity function readily 
yields an invariant relation for the volume associated with this 
distance function in the case of Hilbert-derived complex time 
functions. An especially simple invariant relation for the "mean"
distance, as computed over the time-shift and frequency-shift 
plane, exists for such time functions having finite energy and 
finite mean value. 


I. INVARIANT RELATION FOR THE AMBIGUITY 
FUNCTION 


ONE of the problems in radar is to identify unambiguously
two or more targets which may be
arbitrarily close to each other both in range and
velocity. In order to describe those attributes of a radar 
signal which determine how well such an identification 
can be made, Woodward has proposed the use of the 
correlation function for a combined time shift and fre- 
quency shift, which had been introduced earlier by 
Ville.¹,² This function, which is frequently referred to as
the ambiguity function, is defined in terms of a time-shift
variable $\tau$ and a frequency-shift variable $\Delta$ as

$$\chi(\tau, \Delta) = \int_{-\infty}^{\infty} u(t)\, u^*(t + \tau)\, e^{-2\pi j \Delta t}\, dt \tag{1}$$

$$\chi(\tau, \Delta) = \int_{-\infty}^{\infty} U^*(f)\, U(f + \Delta)\, e^{-2\pi j f \tau}\, df \tag{2}$$

where $u(t)$ is the complex low-frequency modulating
waveform of the radar signal and $U(f)$ its Fourier transform.
Eq. (1) expresses $\chi(\tau, \Delta)$ as a Fourier transformation
of the time function $u(t)\, u^*(t + \tau)$ into the $\Delta$ domain
with $\tau$ a parameter, while (2) expresses it as a Fourier
transformation of the frequency function $U^*(f)\, U(f + \Delta)$
into the $\tau$ domain with $\Delta$ a parameter.


A fundamental result given by Woodward for the
ambiguity function is that the volume under the surface
described by $|\chi(\tau, \Delta)|^2$ is a constant, i.e.,

$$\int_{-\infty}^{\infty} \int_{-\infty}^{\infty} |\chi(\tau, \Delta)|^2\, d\tau\, d\Delta = 1 \tag{3}$$

where that constant is unity if $u(t)$ is normalized so that

$$\int_{-\infty}^{\infty} |u(t)|^2\, dt = 1. \tag{4}$$

* Manuscript received by the PGIT, May 1, 1959.
† Information Studies Sec., General Electric Res. Lab., Schenectady, N. Y.

¹ P. M. Woodward, “Probability and Information Theory,
with Applications to Radar,” McGraw-Hill Book Co., Inc., New
York, N. Y., ch. 7 and sec. 2.9; 1953.

² J. Ville, “Théorie et application de la notion signal analytique,”
Cables and Transmission, vol. 2, pp. 61-74; January, 1948.


Further studies of the ambiguity function, with par- 
ticular reference to the radar waveform design problem, 
have been made by Siebert and Lerner.³⁻⁵


II. INVARIANT RELATION FOR THE CROSS-AMBIGUITY
FUNCTION


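The preceding result may be said to apply to the auto-ambiguity
function. If a cross-ambiguity function is defined
for two complex waveforms $u_1(t)$ and $u_2(t)$ and suitable
normalizations are made, a result identical to (3) may be
readily shown to exist. Thus, the function $\chi_{12}(\tau, \Delta)$ is
defined by

$$\chi_{12}(\tau, \Delta) = \int_{-\infty}^{\infty} u_1(t)\, u_2^*(t + \tau)\, e^{-2\pi j \Delta t}\, dt \tag{5}$$

which, when written in terms of the corresponding frequency
functions $U_1(f)$ and $U_2(f)$, is

$$\chi_{12}(\tau, \Delta) = \int_{-\infty}^{\infty} U_2^*(f)\, U_1(f + \Delta)\, e^{-2\pi j f \tau}\, df. \tag{6}$$

Inasmuch as $\chi_{12}$ is the Fourier transform of
$u_1(t)\, u_2^*(t + \tau)$, Parseval's theorem may be conveniently used
to find the volume under the squared amplitude of $\chi_{12}$;
thus

$$\int_{-\infty}^{\infty} |\chi_{12}(\tau, \Delta)|^2\, d\Delta = \int_{-\infty}^{\infty} |u_1(t)\, u_2^*(t + \tau)|^2\, dt$$

so that

$$\int_{-\infty}^{\infty} \int_{-\infty}^{\infty} |\chi_{12}(\tau, \Delta)|^2\, d\tau\, d\Delta = \int_{-\infty}^{\infty} |u_1(t)|^2\, dt \cdot \int_{-\infty}^{\infty} |u_2(\nu)|^2\, d\nu \tag{7}$$

where $\nu = t + \tau$ in the last integral.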


³ W. M. Siebert, “A radar detection philosophy,” IRE TRANS.
ON INFORMATION THEORY, vol. IT-2, pp. 204-221; September, 1956.

⁴ W. M. Siebert, “Studies of Woodward's Uncertainty Function,”
Res. Lab. of Electronics, M.I.T., Cambridge, Mass., Quart. Progr.
Rept., pp. 90-94; April, 1958.

⁵ R. M. Lerner, “Signals with uniform ambiguity functions,”
1958 IRE NATIONAL CONVENTION RECORD, pt. 4, pp. 27-36.
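With the energy in the waveforms normalized to unity,

$$\int_{-\infty}^{\infty} |u_1(t)|^2\, dt = \int_{-\infty}^{\infty} |u_2(t)|^2\, dt = 1, \tag{8}$$

the volume under $|\chi_{12}|^2$ becomes

$$\int_{-\infty}^{\infty} \int_{-\infty}^{\infty} |\chi_{12}(\tau, \Delta)|^2\, d\tau\, d\Delta = 1. \tag{9}$$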


The cross-ambiguity function (5), and the invariant
relation (9), may be useful in the study of waveforms
for communication systems wherein it is desired to identify
at a receiver one of many possible transmitted waveforms.
In such systems it is desirable to have waveforms which
can be identified correctly, with high probability, despite
time shifts and frequency (Doppler) shifts which may
not be known at the receiver. Thus the distribution of this
combined correlation function of waveforms over the
time-frequency plane is of interest.
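The invariance in (3) and (9) can be checked numerically on sampled waveforms. The sketch below is only an illustration under stated assumptions: it replaces the integrals by sums over $N$ samples, treats the time shift as cyclic, restricts the frequency shift to the $N$ DFT bins, and divides the summed $|\chi_{12}|^2$ by $N$ so that the discrete "volume" corresponds to the continuous one. The function names are hypothetical and the sample values are arbitrary.

```python
import cmath


def cross_ambiguity(u1, u2):
    """Cyclic, sampled stand-in for (5):
    chi[m][k] = sum_n u1[n] * conj(u2[n+m]) * exp(-2*pi*j*k*n/N)."""
    size = len(u1)
    chi = []
    for m in range(size):            # time shift, taken modulo N
        row = []
        for k in range(size):        # frequency shift, in DFT bins
            acc = 0j
            for n in range(size):
                phase = cmath.exp(-2j * cmath.pi * k * n / size)
                acc += u1[n] * complex(u2[(n + m) % size]).conjugate() * phase
            row.append(acc)
        chi.append(row)
    return chi


def unit_energy(samples):
    """Scale a sequence so that sum |u[n]|^2 = 1, the sampled form of (8)."""
    energy = sum(abs(x) ** 2 for x in samples)
    return [x / energy ** 0.5 for x in samples]


def volume(chi):
    """(1/N) * sum of |chi|^2 over the whole shift plane, the sampled form of (9)."""
    return sum(abs(c) ** 2 for row in chi for c in row) / len(chi)


u1 = unit_energy([1, 1j, -1, 2, 0.5, -1j])
u2 = unit_energy([1, -1, 1, 1, -1, 1j])
cross_volume = volume(cross_ambiguity(u1, u2))   # two different waveforms
auto_volume = volume(cross_ambiguity(u1, u1))    # the auto-ambiguity case
print(cross_volume, auto_volume)
```

Both volumes come out equal to unity to within rounding, whatever unit-energy sequences are chosen; only the distribution of $|\chi_{12}|^2$ over the plane changes.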



III. INVARIANT RELATIONS FOR THE REAL AND
IMAGINARY PARTS OF $\chi_{12}(\tau, \Delta)$


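It will now be shown that invariant relations similar to
(9) also exist for the real part and for the imaginary
part of $\chi_{12}$ in those cases for which the complex time
functions $u_1(t)$ and $u_2(t)$ are obtained from real time
functions by means of the Hilbert transform. The real
part of $\chi_{12}$ is defined by

$$\xi_{12}(\tau, \Delta) = \tfrac{1}{2}\left[\chi_{12}(\tau, \Delta) + \chi_{12}^*(\tau, \Delta)\right] \tag{10}$$

and the imaginary part by

$$\eta_{12}(\tau, \Delta) = \frac{1}{2j}\left[\chi_{12}(\tau, \Delta) - \chi_{12}^*(\tau, \Delta)\right]. \tag{11}$$

When written in terms of $\xi_{12}$ and $\eta_{12}$, (9) becomes

$$\int_{-\infty}^{\infty} \int_{-\infty}^{\infty} \left[\xi_{12}^2 + \eta_{12}^2\right] d\tau\, d\Delta = 1. \tag{12}$$

From (9) and (10) the volume under $\xi_{12}^2$ is

$$\int\!\!\int \xi_{12}^2(\tau, \Delta)\, d\tau\, d\Delta
= \tfrac{1}{4} \int\!\!\int \left[\chi_{12}^{*2} + 2|\chi_{12}|^2 + \chi_{12}^2\right] d\tau\, d\Delta
= \tfrac{1}{2} + \tfrac{1}{4} \int\!\!\int \chi_{12}^{*2}(\tau, \Delta)\, d\tau\, d\Delta
+ \tfrac{1}{4} \int\!\!\int \chi_{12}^2(\tau, \Delta)\, d\tau\, d\Delta. \tag{13}$$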


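If the second integral on the right is written in terms
of the frequency (spectrum) functions it becomes

$$\begin{aligned}
\int\!\!\int \chi_{12}^2(\tau, \Delta)\, d\tau\, d\Delta
&= \int d\tau \int d\Delta \left\{ \int U_2^*(f)\, U_1(f + \Delta)\, e^{-2\pi j f \tau}\, df \right\}
   \left\{ \int U_2^*(\nu)\, U_1(\nu + \Delta)\, e^{-2\pi j \nu \tau}\, d\nu \right\} \\
&= \int d\Delta \int\!\!\int U_2^*(f)\, U_1(f + \Delta)\, U_2^*(\nu)\, U_1(\nu + \Delta)\, \delta(-f - \nu)\, df\, d\nu \\
&= \int U_2^*(f)\, U_2^*(-f)\, df \int U_1(\Delta + f)\, U_1(\Delta - f)\, d\Delta
\end{aligned} \tag{14}$$

where $\delta(f)$ is a unit impulse function at $f = 0$.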
For the case in which $u_2(t) = x_2(t) + j y_2(t)$ is a Hilbert-derived
function, i.e.,

$$y_2(t) = \frac{1}{\pi} \int_{-\infty}^{\infty} \frac{x_2(s)}{t - s}\, ds, \tag{15}$$

the value of the double integral in (14) is zero. This
statement follows from the fact that in this case

$$U_2(f) = \begin{cases} 2X_2(f), & f > 0 \\ 0, & f < 0 \end{cases} \tag{16}$$

where $X_2(f)$ is the spectrum function associated with
$x_2(t)$. Thus, with the exception of the point $f = 0$, the
two factors $U_2(+f)$ and $U_2(-f)$ in the integration with
respect to $f$ are nonoverlapping, i.e., where one is nonzero
the other is zero. The integration with respect to $\Delta$
produces a third function of $f$, which, being finite,⁶ cannot
produce a nonzero value in the $f$ integral when the product
$U_2(f)\, U_2(-f)$ is everywhere zero except possibly at the
point $f = 0$. The case in which $x_2(t)$ has an average value,
i.e., an impulse occurs in $X_2(f)$ at $f = 0$, is excluded.

Identical arguments can be made for the first integral
on the right in (13), so that

$$\int\!\!\int \chi_{12}^{*2}(\tau, \Delta)\, d\tau\, d\Delta = \int\!\!\int \chi_{12}^2(\tau, \Delta)\, d\tau\, d\Delta = 0. \tag{17}$$


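⁶ It is easily shown that $\int_{-\infty}^{\infty} U_1^*(\Delta + f)\, U_1^*(\Delta - f)\, d\Delta$ is finite
since $U_1(f)$ belongs to $L^2$ if $u_1(t)$ does. If $U_1^*(\Delta + f)$ is denoted by
$\psi_1(\Delta)$ and $U_1^*(\Delta - f)$ is denoted by $\psi_2(\Delta)$, then from Schwarz's
inequality

$$\left| \int_{-\infty}^{\infty} \psi_1(\Delta)\, \psi_2(\Delta)\, d\Delta \right|^2
\le \int_{-\infty}^{\infty} |\psi_1(\Delta)|^2\, d\Delta \int_{-\infty}^{\infty} |\psi_2(\Delta)|^2\, d\Delta.$$

But both integrals on the right exist since $\psi_1$ and $\psi_2$ belong to $L^2$,
hence $\left| \int \psi_1 \psi_2\, d\Delta \right|^2$ and $\int \psi_1 \psi_2\, d\Delta$ are finite.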
The volume under $\xi_{12}^2$ for Hilbert-derived complex
time functions is now

$$\int_{-\infty}^{\infty} \int_{-\infty}^{\infty} \xi_{12}^2(\tau, \Delta)\, d\tau\, d\Delta = \tfrac{1}{2}. \tag{18}$$

From (12) it follows that

$$\int_{-\infty}^{\infty} \int_{-\infty}^{\infty} \eta_{12}^2(\tau, \Delta)\, d\tau\, d\Delta = \tfrac{1}{2}. \tag{19}$$

Eqs. (18) and (19) are the desired invariant relations
for the real and imaginary parts of $\chi_{12}$, and it is seen that
they contribute equally to the volume under $|\chi_{12}|^2$ in the
case of Hilbert-derived complex time functions.
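The equal split between the real and imaginary parts can also be observed on sampled data. In the sketch below, which is an illustration and not the procedure of this note, a Hilbert-derived waveform with no average value is imitated by a sequence whose DFT occupies only the strictly positive-frequency bins $1, \ldots, N/2 - 1$; the shift plane is cyclic and the volumes are scaled by $1/N$. All names are hypothetical.

```python
import cmath


def inverse_dft(spectrum):
    size = len(spectrum)
    return [sum(spectrum[f] * cmath.exp(2j * cmath.pi * f * t / size)
                for f in range(size)) / size
            for t in range(size)]


def one_sided_signal(positive_bins, size):
    """Unit-energy sequence whose DFT is nonzero only on bins 1 .. size/2 - 1
    (no zero-frequency term, no negative frequencies)."""
    if len(positive_bins) > size // 2 - 1:
        raise ValueError("too many bins: negative frequencies would be occupied")
    spectrum = [0j] * size
    for index, value in enumerate(positive_bins, start=1):
        spectrum[index] = value
    samples = inverse_dft(spectrum)
    energy = sum(abs(x) ** 2 for x in samples)
    return [x / energy ** 0.5 for x in samples]


def cross_ambiguity(u1, u2):
    """chi[m][k] = sum_n u1[n] * conj(u2[n+m]) * exp(-2*pi*j*k*n/N), cyclic in m."""
    size = len(u1)
    return [[sum(u1[n] * u2[(n + m) % size].conjugate()
                 * cmath.exp(-2j * cmath.pi * k * n / size)
                 for n in range(size))
             for k in range(size)]
            for m in range(size)]


size = 8
u1 = one_sided_signal([1, 0.5j, -2], size)
u2 = one_sided_signal([1j, 1, 1], size)
chi = cross_ambiguity(u1, u2)
real_volume = sum(c.real ** 2 for row in chi for c in row) / size
imag_volume = sum(c.imag ** 2 for row in chi for c in row) / size
print(real_volume, imag_volume)
```

Each of the two volumes is one half, in agreement with (18) and (19). If a zero-frequency or negative-frequency bin is populated the two values generally differ, although their sum remains unity.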


IV. INVARIANT RELATIONS FOR THE “DISTANCE”
BETWEEN TIME FUNCTIONS


In the problem of waveform identification, which was 
mentioned in Section II, the concept of ‘distance’ 
between two waveforms is also a useful one. For equal 
energy signals, the cross-correlation and distance functions 
are equivalent; however, the distance concept is somewhat 
easier to interpret geometrically. Thus if signals have a 
finite time duration $T$ and a frequency bandwidth $W$,⁷
they may be represented as points in an $N = 2TW$
dimensional space. Presumably, the separation of these
points in the space would be optimized in some manner 
in the coding procedure for a common time origin. Timing 
may not be known at the receiver, however, and it is 
then necessary in the identification procedure to compare 
the received waveform with all the reference waveforms 
for all possible time shifts. These shifts may be interpreted 
geometrically as a relative motion of the signal points, 
and in this motion the distances can be changed radically. 
The picture may be further complicated by frequency 
shifts, additive noise, and random multipath effects. An 
objective in the coding procedure, then, is to choose the 
signal points so that they will not be moved close enough 
to each other to cause errors or ambiguities in their 
identification at the receiver. 

In this section, the distance, or more precisely, the 
squared distance, between two time functions is defined 
to be the integrated squared difference between them. 
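This distance is a function of the relative time shift $\tau$
and frequency shift $\Delta$, and for finite energy signals is
given by

$$\begin{aligned}
d_{12}^2(\tau, \Delta) &= \int_{-\infty}^{\infty} |u_1(t; 0) - u_2(t + \tau; \Delta)|^2\, dt \\
&= \int_{-\infty}^{\infty} \left[u_1(t; 0) - u_2(t + \tau; \Delta)\right]\left[u_1^*(t; 0) - u_2^*(t + \tau; \Delta)\right] dt
\end{aligned} \tag{20}$$

where $\Delta$ is placed after the semicolon as a parameter of
the time functions.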


7 These conditions are not incompatible if finite time duration 
means that the sample values of the time function are zero outside 
some time interval. 
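It will be useful in the manipulation of this equation to
express $u_1(t)$ and $u_2(t)$ in terms of their amplitude and
phase functions,

$$u_1(t) = a_1(t)\, e^{j\phi_1(t)} \tag{21}$$

$$u_2(t) = a_2(t)\, e^{j\phi_2(t)} \tag{22}$$

so that a time-shifted and frequency-shifted signal
becomes

$$u_2(t + \tau; \Delta) = a_2(t + \tau)\, e^{j[\phi_2(t + \tau) + 2\pi\Delta(t + \tau)]}. \tag{23}$$

The four terms arising from the expansion of (20) are now

$$\int_{-\infty}^{\infty} u_1(t; 0)\, u_1^*(t; 0)\, dt = \chi_{11}(0, 0) = \int_{-\infty}^{\infty} |u_1(t)|^2\, dt = 1, \tag{24}$$

$$\begin{aligned}
\int_{-\infty}^{\infty} u_2(t + \tau; \Delta)\, u_2^*(t + \tau; \Delta)\, dt
&= \int_{-\infty}^{\infty} a_2(t + \tau)\, e^{j[\phi_2(t + \tau) + 2\pi\Delta(t + \tau)]}\, a_2(t + \tau)\, e^{-j[\phi_2(t + \tau) + 2\pi\Delta(t + \tau)]}\, dt \\
&= \int_{-\infty}^{\infty} a_2^2(t + \tau)\, dt \\
&= \chi_{22}(0, 0) = 1,
\end{aligned} \tag{25}$$

$$\begin{aligned}
\int_{-\infty}^{\infty} u_1(t; 0)\, u_2^*(t + \tau; \Delta)\, dt
&= \int_{-\infty}^{\infty} a_2(x)\, a_1(x - \tau)\, e^{j[\phi_1(x - \tau) - \phi_2(x) - 2\pi\Delta x]}\, dx \\
&= \int_{-\infty}^{\infty} u_2^*(x)\, u_1(x - \tau)\, e^{-2\pi j \Delta x}\, dx \\
&= \chi_{21}^*(-\tau, -\Delta),
\end{aligned} \tag{26}$$

$$\int_{-\infty}^{\infty} u_1^*(t; 0)\, u_2(t + \tau; \Delta)\, dt = \chi_{21}(-\tau, -\Delta) \quad \text{(conjugate of above)}. \tag{27}$$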

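In view of (24)-(27) and (10), (20) for $d_{12}^2(\tau, \Delta)$ becomes

$$\begin{aligned}
d_{12}^2(\tau, \Delta) &= \chi_{11}(0, 0) + \chi_{22}(0, 0) - \chi_{21}(-\tau, -\Delta) - \chi_{21}^*(-\tau, -\Delta) \\
&= 2 - 2\,\xi_{21}(-\tau, -\Delta).
\end{aligned} \tag{28}$$

The invariant relation for $\xi_{12}^2$, (18), applies equally well
for $\xi_{21}^2$, hence an invariant relation involving $d_{12}^2$ is readily
obtained:

$$\int_{-\infty}^{\infty} \int_{-\infty}^{\infty} \left(1 - \frac{d_{12}^2(\tau, \Delta)}{2}\right)^2 d\tau\, d\Delta = \tfrac{1}{2} \tag{29}$$

which relation, again, is valid for Hilbert-derived complex
time functions.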

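One might inquire if the volume under $d_{12}^2$ is invariant
with respect to choice of time functions. In this connection
it will be convenient to consider the mean value
$\overline{d_{12}^2}$ over the plane, which is

$$\begin{aligned}
\overline{d_{12}^2(\tau, \Delta)} &= \lim_{T, W \to \infty} \frac{1}{4TW} \int_{-W}^{W} \int_{-T}^{T} d_{12}^2(\tau, \Delta)\, d\tau\, d\Delta \\
&= \lim_{T, W \to \infty} \frac{1}{4TW} \int_{-W}^{W} \int_{-T}^{T} 2\left[1 - \xi_{21}(-\tau, -\Delta)\right] d\tau\, d\Delta.
\end{aligned} \tag{30}$$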


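The mean value of $\xi_{21}(-\tau, -\Delta)$, which appears on the
right-hand side of (30), is certainly the same as that for
$\xi_{12}(\tau, \Delta)$, and may be found in the following manner:

$$\begin{aligned}
\overline{\xi_{12}(\tau, \Delta)} &= \lim_{T, W \to \infty} \frac{1}{4TW} \int_{-W}^{W} \int_{-T}^{T} \xi_{12}(\tau, \Delta)\, d\tau\, d\Delta \\
&= \lim_{T, W \to \infty} \frac{1}{8TW} \int_{-T}^{T} d\tau \int_{-W}^{W} d\Delta \int_{-\infty}^{\infty}
\left\{ u_1(t)\, u_2^*(t + \tau)\, e^{-2\pi j \Delta t} + u_1^*(t)\, u_2(t + \tau)\, e^{2\pi j \Delta t} \right\} dt \\
&= \lim_{T, W \to \infty} \frac{1}{8TW} \int_{-T}^{T} d\tau \int_{-\infty}^{\infty}
\left\{ u_1(t)\, u_2^*(t + \tau) + u_1^*(t)\, u_2(t + \tau) \right\} \frac{\sin 2\pi W t}{\pi t}\, dt.
\end{aligned} \tag{31}$$

Now by the Fourier single-integral formula,⁸

$$f(x) = \lim_{W \to \infty} \int_{-\infty}^{\infty} f(t)\, \frac{\sin 2\pi W (x - t)}{\pi (x - t)}\, dt,$$

the integration with respect to $t$ yields

$$\lim_{W \to \infty} \int_{-\infty}^{\infty} \left\{ u_1(t)\, u_2^*(t + \tau) + u_1^*(t)\, u_2(t + \tau) \right\} \frac{\sin 2\pi W t}{\pi t}\, dt
= \left\{ u_1(0)\, u_2^*(\tau) + u_1^*(0)\, u_2(\tau) \right\} \tag{32}$$

⁸ For example, E. C. Titchmarsh, “Introduction to the Theory
of Fourier Integrals,” Oxford University Press, New York, N. Y.,
2nd ed., sec. 1.1; 1937.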
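where the term in braces corresponds to $f(t)$ and $x$ is set
equal to zero. The mean value of $\xi_{12}(\tau, \Delta)$ in (31) is now

$$\overline{\xi_{12}(\tau, \Delta)} = \lim_{T, W \to \infty} \frac{1}{8TW} \int_{-T}^{T}
\left\{ u_1(0)\, u_2^*(\tau) + u_1^*(0)\, u_2(\tau) \right\} d\tau. \tag{33}$$

For time functions having finite mean value, which
includes all cases of practical interest, it is apparent that
the value of $\overline{\xi_{12}(\tau, \Delta)}$ is zero in (33), i.e.,

$$\overline{\xi_{12}(\tau, \Delta)} = \lim_{T, W \to \infty} \frac{1}{4TW} \int_{-W}^{W} \int_{-T}^{T} \xi_{12}(\tau, \Delta)\, d\tau\, d\Delta = 0. \tag{34}$$

In view of this result, the mean value of $d_{12}^2$ over the
plane is simply

$$\overline{d_{12}^2(\tau, \Delta)} = \lim_{T, W \to \infty} \frac{1}{4TW} \int_{-W}^{W} \int_{-T}^{T} d_{12}^2(\tau, \Delta)\, d\tau\, d\Delta = 2. \tag{35}$$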


Eq. (35) is the desired invariant relation for the distance 
function for complex time functions having finite energy 
and finite mean value. 
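Relations (28) and (35) lend themselves to a direct numerical check on sampled waveforms. The sketch below is an illustration under assumptions that are not in the text: the time shift is cyclic, the frequency shift runs over the $N$ DFT bins, the frequency-shifted signal is taken as $u_2[n+m]\, e^{2\pi j k (n+m)/N}$, and the first waveform is given zero average value, which in this finite setting makes the mean distance exactly 2 rather than only in the limit. The names are hypothetical.

```python
import cmath


def unit_energy(samples):
    energy = sum(abs(x) ** 2 for x in samples)
    return [x / energy ** 0.5 for x in samples]


def distance_sq(u1, u2, m, k):
    """Sampled form of (20): sum_n |u1[n] - u2[n+m] * exp(2*pi*j*k*(n+m)/N)|^2."""
    size = len(u1)
    total = 0.0
    for n in range(size):
        shifted = u2[(n + m) % size] * cmath.exp(2j * cmath.pi * k * (n + m) / size)
        total += abs(u1[n] - shifted) ** 2
    return total


def xi_21(u1, u2, m, k):
    """Real part of chi_21[m][k] = sum_n u2[n] * conj(u1[n+m]) * exp(-2*pi*j*k*n/N)."""
    size = len(u1)
    acc = sum(u2[n] * complex(u1[(n + m) % size]).conjugate()
              * cmath.exp(-2j * cmath.pi * k * n / size)
              for n in range(size))
    return acc.real


raw = [1, 2j, -1, 0.5, -3, 1j]
average = sum(raw) / len(raw)
u1 = unit_energy([x - average for x in raw])     # zero average value
u2 = unit_energy([1, -1, 1j, 2, 0, -1j])
size = len(u1)

# Largest departure from d^2 = 2 - 2 * xi_21(-tau, -Delta) over the whole plane, as in (28).
identity_gap = max(abs(distance_sq(u1, u2, m, k) - (2 - 2 * xi_21(u1, u2, -m, -k)))
                   for m in range(size) for k in range(size))

# Mean of d^2 over the plane, the sampled form of (35).
mean_distance_sq = sum(distance_sq(u1, u2, m, k)
                       for m in range(size) for k in range(size)) / size ** 2
print(identity_gap, mean_distance_sq)
```

The first printed value is at rounding level and the second is 2, independent of the particular unit-energy sequences used.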

In the case of wide band signals of long duration, the
result contained in (34) is approximately satisfied over a
finite rectangle $2T$ by $2W$, i.e., the contribution of
$\xi_{21}(-\tau, -\Delta)$ to the integral in (30) is small compared
to the contribution of the constant term of unity in the
integrand. Thus, an approximate expression for the
volume under the distance function is

$$\int_{-W}^{W} \int_{-T}^{T} d_{12}^2(\tau, \Delta)\, d\tau\, d\Delta \approx 8TW. \tag{36}$$

This volume is seen to be determined entirely by the
number of dimensions of the signals. Ideally, then, to
make the best use of this volume, the coding procedure
should place the ridges and peaks of the topography of
$d_{12}^2$ in the regions of the expected trajectories of the
shifted signal in the $\tau, \Delta$ plane, and it should place the
valleys outside these regions.
On Upper Bounds for Error Detecting and Error
Correcting Codes of Finite Length*


NELSON WAX†


Summary—Upper bounds for error detecting and error correcting
codes are obtained in this paper. One upper bound is found
by exploiting the geometrical model of coding introduced by Hamming.
The volume of an appropriate geometrical body is compared
with the volume of the unit cube, in getting the first upper bound.
An improvement on this upper bound can be found by introducing
a mass density function, and comparing the mass of the body with
the mass of the unit cube.

A comparison is made with known upper bounds, and with
best codes found thus far. The improved upper bound given here
is frequently somewhat smaller than previously known upper
bounds.


I. INTRODUCTION

A NUMBER of attempts have been made to
construct codes which will enable one to detect
or correct errors on reception of a set of transmitted
symbols. These attempts have been only partly
successful; codes have been found which will be error
detecting or correcting for particular lengths of the blocks
of symbols used and for a highly restricted number of
errors. No general procedures for obtaining codes have
been discovered as yet, nor is it known, except in particular
cases, whether a code is optimal or not.

Upper bounds for optimal binary codes of finite length
have been found, but these bounds are known to yield
crude estimates of the number of possible code points in
some cases, where the optimal codes have been determined
by explicit enumeration. The purpose of this paper is to
obtain a new upper bound for error detecting and error
correcting codes of finite length, without regard to
whether the codes are systematic or not.
Definitions and a review of previous pertinent work are
given in the next section. One upper bound for the number
of code points is obtained in Section III, and a further
improvement on this bound is given in Section IV for
binary codes. Section V contains a generalization of the
previous results to codes using any number of elementary
symbols. A number of mathematical details concerning
the computation of a certain $n$-dimensional volume are
given in the Appendix.


II. DEFINITIONS AND REVIEW


Given a set of $B$ elementary symbols $a_1, \cdots, a_B$, an
ordered sequence of $n$ symbols (an ordered $n$-tuple) is
called a “letter of length $n$.” If the symbols are
independent, then a totality of $B^n$ letters, termed the
“alphabet,” is possible. The usual case treated occurs
when $B = 2$, the two symbols (digits) being 0 and 1; if
the letters are of length $n$ then there are $2^n$ possible
letters in the alphabet. Attention here, and in the next
two sections, will be confined to the case where $B = 2$.
An error is defined as a change of a “0” to a “1”, or
vice versa, in any position of a letter. Multiple errors are
to be considered as single errors in two or more different
positions of a letter.

The occurrence of an error transforms one letter to
another; if all the $2^n$ letters are allowed then there is no
way of deciding that an error (or errors) has occurred. It
is thus necessary to select a subset of the set of $2^n$ letters in
order to detect the presence of an error. In order to correct
an error, the position of the error must be known; the
condition is also sufficient in the binary case. Any subset
of the set of $2^n$ letters which permits one to detect or
correct errors will be called a set of code points, for
geometrical reasons to be discussed later. The subset (or
subsets) with the greatest number of members, which
will allow one to detect or correct a given number of
errors, will be termed an optimal code.

Let $m$ digits of a letter of length $n$ be arbitrary, and let
the remaining $k = n - m$ of the digits be fixed functions
of the $m$ digits, whose values, however, are either 0 or 1.
If all the letters of a set of code points satisfy this condition,
the set is called a systematic code.

An upper bound on the number of code points for binary
systematic error correcting codes has been obtained by
Hamming. Let a maximum of $e$ errors be possible in any
letter, then by counting the number of ways in which the
$k$ digits can be used to correct $0, 1, 2, \cdots, e$ errors one
finds that an upper bound, $\nu_H(n, e)$, on the maximum
number of code points, $N_m(n, e)$, is given by

$$N_m(n, e) \le \frac{2^n}{\sum_{i=0}^{e} \binom{n}{i}} = \nu_H(n, e). \tag{1}$$

This is Hamming's upper bound. A few codes have been
found for which $N_m(n, e)$ is equal to the upper bound, but
it is known also that in some cases $\nu_H(n, e)$ is unattainable.
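Hamming's bound (1) is simple to evaluate. The short sketch below (the function name is hypothetical) computes $\nu_H(n, e) = 2^n / \sum_{i=0}^{e} \binom{n}{i}$ and can be used to check individual $\nu_H$ entries of Table I.

```python
from math import comb


def hamming_bound(n, e):
    """nu_H(n, e) = 2**n / sum_{i=0}^{e} C(n, i), as in (1)."""
    return 2 ** n / sum(comb(n, i) for i in range(e + 1))


print(hamming_bound(7, 1))             # attained by the single-error-correcting code of length 7
print(round(hamming_bound(8, 1), 1))
print(round(hamming_bound(12, 3), 1))
```

For $n = 7$, $e = 1$ the bound is the integer 16, one of the cases in which it is attained; for most other pairs $(n, e)$ it is not an integer, so the number of code points must lie strictly below it.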
A geometrical model, also introduced by Hamming,
makes it possible to discuss codes in a general way, without
regard to whether they are systematic, or not. Any
ordered $n$-tuple of digits may be thought of as a point in
$n$-dimensional space with each digit representing the
value of a different coordinate of the space; if the digits
are binary digits then the point is a vertex of a unit cube.
The alphabet of letters represents all the vertices of a
unit cube, and the set of code points is a subset of these
vertices.

Hamming defines a metric, $D$, between two vertices as
the number of coordinates for which the vertices are
different. Note that the Hamming distance, $D$, is just the
square of the Euclidean distance, $d$, between vertices. If
a single error occurs then one letter is changed into
another letter, and $D = 1$ between these two letters.
Now if the set of code points is chosen such that every
letter is a Hamming distance $D \ge 2$ from every other
letter in the set, then a single error in any member of the
set will be recognizable, for the letter will be transformed
to a letter not in the set. The single error will thus be
detectable. If the code points are chosen such that $D \ge 3$
for every pair of letters in the set of points then a single
error will be correctable. Codes with a minimum distance
$D = 4, 5, \cdots$ between points allow detection or correction
of multiple errors. The determination of optimal codes is
thus equivalent to finding the subset (or subsets) with the
greatest number of points, which satisfies the minimum
distance condition.
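The distance conditions above can be tried out on two small binary codes. The sketch below is illustrative only; the two example codes and the function names are chosen here and do not come from the paper.

```python
from itertools import combinations


def hamming_distance(a, b):
    """Number of coordinates in which two letters differ."""
    return sum(x != y for x, y in zip(a, b))


def squared_euclidean(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))


def minimum_distance(code):
    return min(hamming_distance(a, b) for a, b in combinations(code, 2))


even_weight = [(0, 0, 0), (0, 1, 1), (1, 0, 1), (1, 1, 0)]   # D = 2: single error detectable
repetition = [(0, 0, 0), (1, 1, 1)]                          # D = 3: single error correctable

print(minimum_distance(even_weight), minimum_distance(repetition))

# For binary letters D coincides with the squared Euclidean distance d**2.
same_metric = all(hamming_distance(a, b) == squared_euclidean(a, b)
                  for a, b in combinations(even_weight + [(1, 1, 1)], 2))
print(same_metric)
```

The first code meets the condition $D \ge 2$ and the second the condition $D \ge 3$; both are subsets of the eight vertices of the unit cube in three dimensions.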
A distinction is made between even and odd values of
$D$ in obtaining upper bounds. Hamming surrounds each
code point by a sphere of radius $R_H = (D - 1)/2 = e$
when $D$ is odd, and counts the number of vertices within
and on the sphere, thereby obtaining $\nu_H(n, D) =
\nu_H(n, e = (D - 1)/2)$, as given in (1).

If $D$ is even, then

$$N_m(n, D = 2e) = N_m(n - 1, D = 2e - 1) \tag{2}$$

is derived, $e$ being a positive integer, and used to find
upper bounds for even $D$ by noting that $\nu_H(n - 1, e =
(D - 2)/2) \ge N_m(n - 1, D - 1) = N_m(n, D)$.
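One direction of (2) can be made concrete with a standard construction, which is an assumption of this sketch since the text does not spell out the derivation: appending an overall parity digit to every letter of a code with odd minimum distance $D$ gives a code of length one greater, with the same number of points and minimum distance $D + 1$. The function names are hypothetical.

```python
from itertools import combinations, product


def minimum_distance(code):
    return min(sum(a != b for a, b in zip(u, v)) for u, v in combinations(code, 2))


def extend_with_parity(code):
    """Append one digit to every letter so that each letter has an even number of 1s."""
    return [tuple(word) + (sum(word) % 2,) for word in code]


repetition = [(0, 0, 0), (1, 1, 1)]                  # n - 1 = 3, D = 3
all_pairs = list(product((0, 1), repeat=2))          # n - 1 = 2, D = 1

d_rep = (minimum_distance(repetition), minimum_distance(extend_with_parity(repetition)))
d_all = (minimum_distance(all_pairs), minimum_distance(extend_with_parity(all_pairs)))
print(d_rep, d_all)
```

The number of code points is unchanged by the extension, which is why a bound for length $n - 1$ and odd distance $D - 1$ also serves for length $n$ and even distance $D$.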


III. AN UPPER BOUND ON THE NUMBER OF CODE
POINTS (THE HARD SPHERE MODEL)


The geometrical model introduced by Hamming, and
discussed in the previous section, may be used to obtain
an upper bound, $\nu_1(n, D)$, on the number of code points,
for any distance $D$. This will be done here, and an improvement
on this bound, $\nu_2(n, D)$, will be obtained in Section
IV.

Consider a set of code points, whose members are
denoted by the symbols $r_i$, where $r_i = (x_1^{(i)}, \cdots, x_n^{(i)})$, and
the $x_j^{(i)}$ are either 0 or 1. Imagine a sphere of radius $R$
with center at $r_i$ to surround each code point.

The spheres may touch, but they are not permitted to 
intersect each other. Thus the distances between code 
points are greater than or equal to 2R. (All distances, 
unless stated otherwise, are Euclidean distances.) One 
may think of the spheres as “hard,” that is, as being 
composed of an impenetrable, incompressible material 
whose mass density is unity, throughout the sphere. The 
interstices between spheres can be considered to be free 
of matter (mass density zero). 

Picture now the spheres of radius R placed at the code 
points, and the entire structure, supposed rigid, sawed 
with the spheres fixed in position, along the faces of the 
unit cube. 

The portion of each sphere contained within the unit 
cube will be called a truncated spherical element. The 
truncated spherical elements will all have the same volume, 
because of symmetry. It is convenient to compute the 
volume of the truncated spherical element whose apex 
(the point which was originally the center of the sphere, 
before truncation) is at the origin, (0, 0, 0, --- , 0). 

The volume of the truncated spherical element, $V(n, R)$,
is given by

$$V(n, R) = \int \cdots \int_{\Omega} dx_1 \cdots dx_n \tag{3}$$

where $\Omega$ is the set of points which is the intersection of
the set $s_1$ and the set $c_1$, namely $\Omega = s_1 \cap c_1$, with
$s_1$ = the set for which $0 \le \sum_{j=1}^{n} x_j^2 \le R^2$,
and $c_1$ = the set for which $0 \le x_j \le 1$, $j = 1, \cdots, n$.
It is shown in the Appendix that $V(n, R)$ may be
written as

[Eq. (4), a single integral over $\xi$ whose integrand contains $C(\xi)$ and $S(\xi)$, is not legible in this copy.] (4)

where $C(\xi)$ and $S(\xi)$ are the Fresnel integrals. The integral
on the right-hand side of (4) may be evaluated numerically.

The connection between $V(n, R)$ and $\nu_1(n, R)$, a bound
on the maximum number of code points, can be established
by comparing the volume (or mass) of the truncated
spherical element with the volume (or mass) of the unit
cube. One has

$$\nu_1(n, R) = \frac{1}{V(n, R)} \ge N_m(n, R), \tag{5}$$

where $N_m(n, R)$ is the maximum number of code points.

Various values of $V(n, R)$ have been computed using
Illiac, the University of Illinois digital computer. The
corresponding values of $\nu_1(n, R)$ are given in Table I
together with Hamming's bound $\nu_H(n, e)$ for comparison.
It can be seen that $\nu_1 > \nu_H$ for all $R$.

The comparison with $\nu_H(n, e)$ is effected by relating $R$
to $e$. Since the Euclidean distance between the centers of
the spheres is $2R$, then $R = d/2 = D^{1/2}/2$, or $R^2 = D/4$.
TABLE I
COLLECTED RESULTS
R²   0.75   1.25   1.75   2.25   2.75   3.25
e    1      2      3      4      5      6
n
te ewBy 74.2 12:4 4.1 
Vy 16 4.4 2 
V» 13e2- seh ik 
NG 16 2 2 
8 199.3 25.6 Tel 
vi 28.4 6.9 2.8 
py 20.3* 5.3 3 
Neg 20me 4 2 
9 n 56.7 56.2 12.9 4.8 
Viz Dila2 iti i 3.9 2 
ve 39.7% 9.2 Aeo a, 
Nez 38** 6 2 2 
10 1.69 X 103 128.7 25.0 7.6 
vi 93.1 1833 5.8 ai 
V; Song 16.6 6.1 3.4 
Ne 68** 12 2 2 
Lia 5.29 x 103 312.5 50.6 13.5 5.5 
Vy WiOea 30.6 8.8 3.6 2s 
Vy 154.8* 26.5* 8.6* 4.3 a8 
Neg 128 24 4 2 2 
Be 1.72 X 104 781.2 106.9 25.5 ates 
vi 315.1 51.8 13.7 5.2 2.6 
v2 346. 8* 46 .7* 17% 5.7 ue 
Np 256 24 4 2 2 
13° 5.84 < 104 2.05 X 103 235.8 48.9 15.2 6.3 
vy 585.1 89.0 alot Wess 3.4 2 
Vy 806* 85.1* 19.3* 72° 4.0* 2.9 
Ne 512 32 8 2 2 2 
144 2.05 X 105 5.56 X 103 537.6 97.6 26.8 10.0 
vy 1.09 X 108 154.6 34.9 meres 4.7 2.5 
ve 1913* 160* 30.1% 9 .8* 4.9* 3.3 
Ne 1024 48(?)t 16 4 2 2 
Wh 7.44 X 105 1.55 xX 104 1.27 X 108 201.6 49.3 16.6 
vy 2.05 X 103 270.8 56.9 16.9 6.6 Be 
vy 4656* 309.8* . 48 .9* 13 .8* 6.2* 3.9 
Nz 2048 32 4 2 2 
160 9y 2.78 x 106 4.47 X 10! 3.11 X 103 431.0 94.0 28.5 
vy 3.86 X 103 478.4 94.0 26.0 9.5 4.4 
V» 1.16 x 104 617 .3* 82.0* 24r9 8.8 4.7 | 
Np 2048 32 2 2 | 
3, | a 
172 oy 1.07 X 107 1.33 X 105 7.81 X 103 951.5 185.2 50.7 ‘ 
viz 7.32 < 103 851.1 157.2 40.8 13.9 6.0 
vy 2.96 X 10% 1264* 140.0* 37.5 1201 5.9 
Nz 4096 64 4 2 


$\nu_1$ = the upper bound obtained from the “hard sphere” model.
$\nu_2$ = the improved upper bound, using the “soft sphere” model.
$\nu_H$ = the Hamming bound.
$N_B$ = the best value actually found. These entries have been taken largely from Laemmel,⁶ except for entries marked with a double
asterisk.
* These entries have been computed using (2). All other $\nu_2$ entries have not used (2).
** These entries are corrections to Laemmel,⁶ as given by Golay; they were brought to the author's attention by Dr. B. Elspas.
† This entry is taken directly from Laemmel,⁶ namely “48 (?)”.


Furthermore $D = 2e + 1$, in an $e$ error correcting code;
hence $R = (2e + 1)^{1/2}/2$.
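For $R \le 1$ the bound (5) can be evaluated without the Appendix integral, because the portion of the sphere with all coordinates nonnegative then lies wholly inside the unit cube and $V(n, R)$ is $2^{-n}$ times the volume of an $n$-dimensional sphere of radius $R$. The sketch below uses the standard closed form $\pi^{n/2} R^n / \Gamma(n/2 + 1)$ for that volume, which is an assumption brought in here and not a formula from the paper; it covers only the column $R^2 = 0.75$ ($e = 1$) of Table I. Function names are hypothetical.

```python
from math import gamma, pi


def truncated_sphere_volume(n, radius):
    """V(n, R) for 0 < R <= 1: one 2**n-th of the n-dimensional sphere volume."""
    if not 0 < radius <= 1:
        raise ValueError("this closed form holds only for 0 < R <= 1")
    sphere = pi ** (n / 2) * radius ** n / gamma(n / 2 + 1)
    return sphere / 2 ** n


def hard_sphere_bound(n, radius):
    """nu_1(n, R) = 1 / V(n, R), as in (5)."""
    return 1.0 / truncated_sphere_volume(n, radius)


radius = 0.75 ** 0.5                    # R**2 = (2e + 1)/4 with e = 1
nu1_n7 = hard_sphere_bound(7, radius)
nu1_n8 = hard_sphere_bound(8, radius)
print(round(nu1_n7, 1), round(nu1_n8, 1))
```

The two values agree with the first-column $\nu_1$ entries 74.2 and 199.3 of Table I to within rounding of the last digit. For $R > 1$ the cube faces cut into the sphere and a numerical evaluation such as the one described in the text is required.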


IV. ANOTHER UPPER BOUND (THE SOFT SPHERE MODEL)


It may have been noticed, on examination of Table I,
that $\nu_1$ is closer to $\nu_H$ for large $R$ (entries on the right-hand
side of the table) than for small $R$. These results occur
because the interstices between the truncated spherical
volume elements occupy a large fraction of the volume of
the unit cube when $R$ is small, and a much smaller fraction
of the unit cube when $R$ is large. A method of filling up
the interstices is needed which will permit one to improve
the bound $\nu_1(n, R)$. This is done in this section by adapting
techniques developed by Blichfeldt⁹ and Rankin¹⁰ in
their studies in the geometry of numbers.

The volume (or mass) of a bounding cube was compared
with the volume (or mass) of a truncated spherical element,
in obtaining $\nu_1(n, R)$, and the truncated spherical elements
were assigned a unit weighting function (filled with
impenetrable matter of constant mass density unity).
Consider, however, another system in which the set of
original spheres, $S_1$, is replaced by a new set of spheres,
$S_2$, fixed in the same positions as the spheres, $S_1$. Let the
spheres, $S_2$, be: 1) of larger radius than the spheres $S_1$,
2) filled with matter which is penetrable, but whose
(variable) mass density is never greater than unity at any
point in space, regardless of how many of the spheres $S_2$
overlap. The system of spheres, $S_2$, thus fills the interstices
which were between the spheres, $S_1$, and one can improve
the bound $\nu_1(n, R)$ thereby.

First, the spheres $S_2$ are taken to be of radius $R_2 =
\sqrt{2} R_1$, where $R_1$ was the radius of the spheres $S_1$. Next
a weighting function (mass density function) $\rho(\xi^{(j)})$ is
assigned to each point.

Let $r = (x_1, x_2, \cdots, x_n)$ represent the coordinates of
any point, and let

$$\xi^{(j)} = |r - r_j| = \left\{ (x_1 - x_1^{(j)})^2 + \cdots + (x_n - x_n^{(j)})^2 \right\}^{1/2}$$

represent the distance between any point and the $j$th
code point (the center of the $j$th sphere). One defines
$\rho_1(\xi)$ by

$$\rho_1(\xi^{(j)}) = \begin{cases} 1 - \dfrac{(\xi^{(j)})^2}{R_2^2}, & 0 \le \xi^{(j)} \le R_2 \\[1ex] 0, & \xi^{(j)} \ge R_2. \end{cases} \tag{6}$$

This is one of the weighting functions used by
Blichfeldt⁹ who has shown that the total mass density
at any point is less than or equal to unity, i.e.,

$$\sum_j \rho_1(\xi^{(j)}) \le 1.$$


⁹ H. F. Blichfeldt, “The minimum value of quadratic forms, and
the closest packing of spheres,” Math. Ann., vol. 101, pp. 605-608;
1929.

¹⁰ R. A. Rankin, “On the closest packing of spheres in n dimensions,”
Annals of Math., vol. 48, pp. 1062-1081; October, 1947.
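Blichfeldt's inequality can be checked on a small code. The sketch below takes the weighting function in the form $\rho_1(\xi) = 1 - \xi^2 / R_2^2$ for $\xi \le R_2$ and zero beyond, with $R_2^2 = 2R_1^2 = D/2$; this form is assumed here as the usual Blichfeldt weight. The code used (all even-weight letters of length 4, so $D = 2$), the sampling grid and the names are illustrative choices.

```python
from itertools import product


def blichfeldt_density(xi_sq, r2_sq):
    """rho_1 as a function of the squared distance xi**2 from a code point."""
    return max(0.0, 1.0 - xi_sq / r2_sq)


def total_density(point, code, r2_sq):
    """Sum over all code points of rho_1 evaluated at one point of the cube."""
    return sum(
        blichfeldt_density(sum((x - c) ** 2 for x, c in zip(point, word)), r2_sq)
        for word in code
    )


# Even-weight code of length 4: D = 2, so R_1**2 = D/4 = 0.5 and R_2**2 = 1.
code = [w for w in product((0, 1), repeat=4) if sum(w) % 2 == 0]
r2_sq = 1.0

grid = [i / 4 for i in range(5)]                       # sample coordinates 0, 0.25, ..., 1
worst = max(total_density(p, code, r2_sq) for p in product(grid, repeat=4))
print(worst)
```

The largest total density found on the grid is exactly 1; it is reached at the code points themselves and at the midpoints between neighbouring code points, where two overlapping spheres each contribute one half.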
Two other weighting functions have been found to be
useful. $\rho_2(\xi)$ is given by


R 
ere ) < (7) iG eee 
P2 p E 2 
2 nk, ) ny = 
2(n + 1) log (— athe 
= 0, aah = Rs 


Finally, $\rho_3$, particularly useful for large $n$, is defined by¹¹
p=l, O0<? <(V2—-DR, 


ir a ). 
cate (E , 


1/2 


: So (i) n 
(V2 — DR. <¢? < (2), 
- (8) 
So _ 1a) (n + IE? 
nti (1 Qn log nk; } 
1/2 
. n Me 
(2)"n ce ch 
= 0, aw = R, ; 
where $\alpha = 2.51286 \cdots$ is the nonzero root of

$$e^{\alpha/2} = 1 + \alpha.$$


Both $\sum_j \rho_2(\xi^{(j)})$ and $\sum_j \rho_3(\xi^{(j)})$ are less than or equal
to one, for all points; proofs can be found in the references.

Finally, the system of spheres, $S_2$, is truncated, as
before, along the faces of the unit cube. The total mass,
$M_T$, of all of the overlapping truncated spherical elements
will be given by

$$\begin{aligned}
M_T &= \sum_j \int \cdots \int \rho(\xi^{(j)})\, dx_1 \cdots dx_n \\
&\ge N_m(n, R) \int \cdots \int_{\psi} \rho(r)\, dx_1 \cdots dx_n \\
&= N_m(n, R)\, M(n, R_2),
\end{aligned} \tag{9}$$

where $\psi$ is the set of points $s_2 \cap c_1$, with $s_2$: $0 \le \sum_{j=1}^{n} x_j^2 \le R_2^2$,
and $c_1$ is the unit cube. $\rho(r)$ is the mass density distribution
function for a single spherical element (without overlap),
$r$ represents the distance from the origin, and is used in
place of $\xi$ in (6)-(8), and $M(n, R_2)$ is the mass of a
single spherical element. $N_m(n, R)$ is, again, the maximum
number of code points.

Let the unit cube be filled with matter of mass density
unity. Then an upper bound on $N_m(n, R)$ is found from

$$1 \ge M_T \ge N_m\, M(n, R_2)$$

or

$$\nu_2(n, R) = \frac{1}{M(n, R_2)} \ge N_m(n, R). \tag{10}$$

¹¹ This definition for $\rho_3(\xi)$, chosen for its relative simplicity,
differs from that given by Rankin, but not in a significant way.
The major contributions to the sphere's volume and mass come
from the outermost portions of the sphere, where the two definitions
coincide.


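Various expressions for $M(n, R_2)$ may be obtained,
depending on the value of $R_2$, and the weighting function
used.

If $R_2 \le 1$ then $M(n, R_2)$ can be given exactly. Note
that $\psi$ is the $2^{-n}$th part of the sphere of radius $R_2$, hence

$$\begin{aligned}
M(n, R_2) &= \int \cdots \int_{\psi} \rho(r)\, dx_1 \cdots dx_n \\
&= \frac{\int \cdots \int_{\psi} \rho(r)\, dx_1 \cdots dx_n}{\int \cdots \int_{\psi} dx_1 \cdots dx_n} \int \cdots \int_{\psi} dx_1 \cdots dx_n \\
&= \langle \rho \rangle\, V(n, R_2).
\end{aligned} \tag{11}$$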
If p.(r) is used then 


Vin, Rk) (0) 


2+n(V2 — 
(ie i 4 


A slight improvement on (p,) is possible for large n, 
by using (p;). One finds that 


(ps) = at — B/n + o(L)}, (13) 
where 6 = 1.91288 --- 


If $R_2 > 1$ then the only exact expression found for
$M(n, R_2)$ has been obtained by using $\rho_1(r)$. It is

9 n—1/2 
oR (2) 


ie sin £b sin (n — 1)6 + E11 — a) ( a . 
; VE 


(p2) = (12) 


Min, V/2R,) = Min, RB) = Vin, 4/2R) — 


g g 
en eA ae 
T oR (?) 
_ f° sin £b sin (no — Ea) ( A ee q 4 
ieee Fi) (14) 


with a = R2 ~/2 (4/2 — 1) and b =<R? (4/2 — 1). 
No proof of (14) is given here; (14) was obtained by
techniques similar to those used in evaluating $V(n, R)$,
as shown in the Appendix.

The above formulation for $M(n, R_2)$ suffers from the
disadvantage that only the crudest of the three weighting
functions has been employed. The other weighting functions
were used by integrating (9) by parts, using the
previously computed tables of values of $V(n, R)$. Larger
values of $M(n, R_2)$ were obtained for small $n$ and $R$,
using $\rho_2(r)$, rather than $\rho_3(r)$, whereas the situation was
reversed when $n$ and $R$ were large.

A set of values of $\nu_2(n, R)$, $\nu_H(n, R)$, and the maximum
number of code points known, are given in Table I. The
values of $\nu_2(n, R)$ were often improved by using (2); the
best values are listed.


The bounds obtained here, $\nu_2(n, R)$, are smaller than
$\nu_H(n, R)$ for values of $R$ roughly in the central columns
of Table I. These results might have been anticipated.
When $R$ is small then $\nu_H$ can sometimes be attained, thus
no improvement over $\nu_H$ is possible for those cases.
Furthermore $\nu_H$ is always fairly close to the actual values
found for small $R$; hence little improvement is likely, in
general. When $R$ is large, namely when $D > \tfrac{2}{3} n$ or
$R > (n/6)^{1/2}$, then it is known that $N_m(n, R) = 2$, and
$N_m(n, \sqrt{n/6}) = 4$. However $\nu_H(n, R)$ is not far from
either of these values for appropriately large $R$. Hence
slight improvement on $\nu_H$ should be expected even for the
most refined $\rho(r)$. It is only for intermediate values of $R$
that $\nu_2 < \nu_H$ might, and does, occur with regularity.


V. GENERALIZATION TO ANY BASE 


The discussion in the previous sections has been limited, 
almost entirely, to binary codes. It is possible to extend 
the results given above to error detecting codes using B 
elementary symbols, if a slightly artificial quantitative 
definition of multiple error is adopted. 

Consider a code using the elementary symbols 0, 1, 2.
If the symbol 2, at a fixed position in a letter, is transformed
to a 1, then it is natural to say that a single error
has occurred. Suppose, however, that $2 \to 0$; is this to be
considered a single or a multiple error? It is convenient
and, in part, justifiable intuitively, to define the change
$2 \to 0$ as a double error. In general, consider two letters of
the alphabet, $P_1 = (x_1^{(1)}, \cdots, x_n^{(1)})$ and $P_2 = (x_1^{(2)}, \cdots,
x_n^{(2)})$, with each $x_j$ one of the $B$ elementary symbols 0,
1, $\cdots$, $B - 1$. Then the transformation $P_1 \to P_2$ will
be defined as an error of multiplicity $e$ whenever

$$\sum_{j=1}^{n} \left| x_j^{(1)} - x_j^{(2)} \right| = e.$$


In the binary case, if e is given then the Euclidean
distance, d, between two points is $d^2 = e$. This is no longer
true when B > 2, since for given e, d is not determined
uniquely. A smallest (integral valued) $d^2$ exists, however,
for each (integral valued) e; this smallest d, $d_{\min}(e)$, will
be used as the Euclidean distance associated with given e.
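The definitions above can be sketched in a few lines of Python. This is only an illustration: the function names are hypothetical, and the minimization assumes that all n coordinates of a letter are available to carry the error, so that the smallest squared distance is obtained by spreading the multiplicity e as evenly as possible over the coordinates.

```python
def error_multiplicity(p1, p2):
    """Multiplicity e = sum_j |x_j^(1) - x_j^(2)| of the change P1 -> P2."""
    if len(p1) != len(p2):
        raise ValueError("letters must have the same length")
    return sum(abs(x - y) for x, y in zip(p1, p2))


def d_min_squared(e, n, B):
    """Smallest squared Euclidean distance between two letters of length n
    over the symbols 0..B-1 whose error multiplicity is e (assumed model)."""
    b = B - 1
    if not 0 <= e <= n * b:
        raise ValueError("multiplicity e is not attainable")
    # An even spread of e over n coordinates minimizes the sum of squares.
    low, extra = divmod(e, n)
    return (n - extra) * low ** 2 + extra * (low + 1) ** 2


# The change 2 -> 0 in one position counts as a double error.
print(error_multiplicity((2, 0, 1), (0, 0, 1)))
# In the binary case (B = 2) the result reduces to d^2 = e.
print(d_min_squared(3, 5, 2))
```

For B = 2 each coordinate can differ by at most 1, so the function returns e itself, in agreement with the binary statement in the text.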

It should also be noted that error correction is not
possible, in general, even if the location and the multiplicity
of the error are specified. Thus if B = 5, and if it is known
that a particular symbol, a 2, is in error and that two 
errors have occurred, then either a 0 or a 4 was the original 
symbol. Ambiguity still exists, and one needs to know the 
sign of the error in order to correct the error. The bounds 
found in this section are thus bounds on the maximum 
number of code points for error detecting codes. 

The possible code points are the lattice points of a
cube of side b = B − 1, where a lattice point is a point
with integral coordinates. All the previous ideas may now
be applied. In particular, the volume of the cube is $b^n$, and
each code point is surrounded by a sphere of radius $R_1$,
where

$$R_1 = \frac{d_{\min}(e)}{2}, \tag{15}$$

and the spheres are truncated along the faces of the cube.
Similarly, the spheres of radius $R_1$ may be replaced by
spheres of radius $R_2 = \sqrt{2}\,R_1$, and the density functions
used, as before.

A simple scaling argument allows one to use the previous
results.

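Let $V(n, R_1, b)$ be the volume of the truncated spherical
element, when the cube is of side b. Then

$$V(n, R_1, b) = \int \cdots \int_{\Omega_1} dx_1 \cdots dx_n, \tag{16}$$

where $\Omega_1$: $0 \le \sum_{j=1}^{n} x_j^2 \le R_1^2$, $0 \le x_j \le b$, $j = 1, \ldots, n$.

Let $y_j = \mu x_j$, with $0 \le y_j \le 1$, $j = 1, \ldots, n$, and
$\mu = 1/b$; then

$$V(n, R_1, b) = \int \cdots \int_{\Omega_1} dx_1 \cdots dx_n = b^n \int \cdots \int_{\Omega_2} dy_1 \cdots dy_n = b^n V(n, R_1/b, 1) = b^n V(n, R_1/b), \tag{17}$$

where $\Omega_2$: $0 \le \sum_{j=1}^{n} y_j^2 \le (R_1/b)^2$, $0 \le y_j \le 1$, using the previous notation.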
Hence 
Ri (18) 
EMG, D) TVG RD) 
and similarly 
i 
v(n, R, b) > M(n, R;/b) (19) 
APPENDIX 


ON THE EVALUATION OF V(n, R)


The volume of the truncated spherical element, V(n, R),
was given by (3), namely

$$V(n, R) = \int \cdots \int_{\Omega} dx_1 \cdots dx_n, \tag{20}$$

where $\Omega$: $0 \le \sum_{j=1}^{n} x_j^2 \le R^2$, $0 \le x_j \le 1$, $j = 1, \ldots, n$.


The multiple integral in (20) is difficult to compute, as
one must determine the regions of intersection of the
unit cube and the sphere and then integrate over the
regions. These difficulties may be avoided by using
the discontinuous Dirichlet factor, δ, to sort out the
spherical region automatically.

Let

$$\delta = \frac{2}{\pi} \int_0^{\infty} \frac{\sin \xi R^2}{\xi} \cos\left(\xi \sum_{j=1}^{n} x_j^2\right) d\xi; \tag{21}$$

then δ = 1 whenever $0 \le \sum_{j=1}^{n} x_j^2 < R^2$, and δ = 0 otherwise;
δ therefore selects the $2^{-n}$th part of the sphere.
Consequently V(n, R) can be expressed by

$$V(n, R) = \frac{2}{\pi} \int_0^{\infty} \frac{\sin \xi R^2}{\xi} \int_0^1 \cdots \int_0^1 \cos\left(\xi \sum_{j=1}^{n} x_j^2\right) dx_1 \cdots dx_n \, d\xi \tag{22}$$

$$= \frac{2}{\pi} \int_0^{\infty} \frac{\sin \xi R^2}{\xi} \, \mathrm{Re} \int_0^1 \cdots \int_0^1 \exp\left(i \xi \sum_{j=1}^{n} x_j^2\right) dx_1 \cdots dx_n \, d\xi \tag{23}$$

$$= \frac{2}{\pi} \int_0^{\infty} \frac{\sin \xi R^2}{\xi} \, \mathrm{Re} \left\{ \left[ \int_0^1 \exp(i \xi x^2) \, dx \right]^n \right\} d\xi. \tag{24}$$
0 
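Since

$$\int_0^1 \exp(i \xi x^2) \, dx = \left(\frac{\pi}{2\xi}\right)^{1/2} \int_0^u \exp\left(i \frac{\pi}{2} t^2\right) dt, \tag{25}$$

where $u = (2\xi/\pi)^{1/2}$, and since

$$\int_0^u \exp\left(i \frac{\pi}{2} t^2\right) dt = C(\xi) + i S(\xi) = \frac{1}{2} \int_0^{\xi} J_{-1/2}(\eta) \, d\eta + \frac{i}{2} \int_0^{\xi} J_{1/2}(\eta) \, d\eta, \tag{26}$$

with C(ξ) and S(ξ) the Fresnel integrals, one has that

$$V(n, R) = \left(\frac{\pi}{2}\right)^{n/2 - 1} \int_0^{\infty} \frac{\sin \xi R^2}{\xi^{n/2 + 1}} \, \mathrm{Re} \left\{ [C(\xi) + i S(\xi)]^n \right\} d\xi, \tag{27}$$

which is (4).
It is convenient in determining V(n, R) numerically
to let

$$C(\xi) + i S(\xi) = A e^{i\varphi};$$

then

$$V(n, R) = \left(\frac{\pi}{2}\right)^{n/2 - 1} \int_0^{\infty} \frac{\sin \xi R^2}{\xi^{n/2 + 1}} \, A^n \cos n\varphi \, d\xi. \tag{28}$$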


A truncated approximation to (28) was used for the
computations. The Fresnel integrals were generated from
their series expansions, and the functions $A/\sqrt{\xi}$ and φ
computed, within Illiac. These functions were then used,
as indicated in (28), to obtain V(n, R).

Two arithmetic checks on the accuracy of the numerical 
integrations are available immediately. 

If R ≤ 1, then (28) should represent the $2^{-n}$th part of
a sphere, namely

$$V(n, R) = \frac{\pi^{n/2} R^n}{2^n \, \Gamma(1 + n/2)}, \tag{29}$$

where Γ(ν) denotes the Gamma function.

Eq. (29) has been used to check the numerical integrations.
The results were accurate to 10 decimal places
when R = 1 (for n ≤ 20).
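The two arithmetic checks on V(n, R) can be mimicked with a short Python sketch. It is not the Illiac computation described above: the closed form for the $2^{-n}$th part of a sphere is evaluated directly, and the truncated volume is estimated by a plain Monte Carlo count over the unit cube. The function names, sample sizes and seed are assumptions made for illustration.

```python
import math
import random


def orthant_ball_volume(n, R):
    """Volume of the 2^-n th part of an n-sphere of radius R (valid for R <= 1)."""
    return math.pi ** (n / 2) * R ** n / (2 ** n * math.gamma(1 + n / 2))


def truncated_volume_mc(n, R, samples=200_000, seed=0):
    """Monte Carlo estimate of V(n, R): the fraction of uniform points in the
    unit cube whose squared distance from the origin does not exceed R^2."""
    rng = random.Random(seed)
    r2 = R * R
    hits = 0
    for _ in range(samples):
        if sum(rng.random() ** 2 for _ in range(n)) <= r2:
            hits += 1
    return hits / samples


# First check: for R <= 1 the estimate should agree with the closed form.
print(orthant_ball_volume(3, 1.0), truncated_volume_mc(3, 1.0))
# Second check: for R^2 >= n the element fills the whole unit cube.
print(truncated_volume_mc(3, 2.0, samples=1000))
```

The Monte Carlo estimate is accurate only to a few decimal places, so it serves as a sanity check on the two limiting cases rather than as a substitute for the integration of (28).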

The volume of the truncated spherical element is the
volume of the unit cube, whenever $R^2 \ge n$. This affords
another check.

The numerical integrations were carried out to
$R^2 = n + 3$ in order to test the accuracy of the methods,
and were found to be correct to at least 8 decimal places
(for n ≤ 20).


The Probability Density of the Output of an RC Filter
When the Input Is a Binary Random Process*


J. A. McFADDEN†


Summary—A new method is given for obtaining the probability 
density of the output of an RC filter when the input is a stationary 
binary random process. The axis-crossing intervals of the input 
are assumed to be statistically independent and identically dis- 
tributed, but with an arbitrary density function. The method in- 
volves a linear integral equation which can be reduced by Laplace 
transforms. A new family of solutions is given which includes two 
previously known cases: the random square wave of Poisson type, 
and the periodic square wave with random time origin. The general 
result of this family is given in terms of tabulated functions. For 
other solutions, a recursive technique may be necessary. 


INTRODUCTION 


THIS PAPER treats the probability density of the
output of an RC filter when the input is a binary
random process. Previously, Wonham and Fuller¹
gave the result for the case in which the input is a random
square wave of Poisson type, i.e., the random telegraphic
signal. For the same input, McFadden² has obtained the
solution when the system is an ideal integrator with 
finite memory. Again, for the random square wave of 
Poisson type, but with an arbitrary linear system, methods 


have been given by McFadden³ and by Wonham.⁴

* Manuscript received by the PGIT, June 22, 1959. This work
was supported by the Office of Naval Research under Contract
Nonr-1100(15).

† School of Electrical Engineering, Purdue University, Lafayette, Ind.

¹ W. M. Wonham and A. T. Fuller, “Probability densities of
the smoothed ‘random telegraph signal’,” J. Electronics and Control,
vol. 4, pp. 567-576; June, 1958.

² J. A. McFadden, “The probability density of the output of
a filter when the input is a random telegraphic signal,” 1959 IRE
NATIONAL CONVENTION RECORD, pt. 4, pp. 164-169.

In the present paper, the system is again specialized 
to the case of the RC low-pass filter, but the input is 
generalized. The input considered here is a stationary 
binary random process, the axis-crossing intervals of 
which are statistically independent and identically
distributed. The distribution of the lengths of these 
intervals is arbitrary. (It should be noted that this input 
is not a Markov process.) 


DERIVATION OF THE BASIC EQUATIONS


The differential equation governing the given system
is the following:

$$\frac{dy(t)}{dt} + \frac{1}{T}\,y(t) = \frac{1}{T}\,x(t), \tag{1}$$

where x(t) is the input, y(t) is the output, t is time, and
T = RC, the time constant of the system.

Let the input assume only the values ±1, with either
value being equally probable. Let $t_0$ be a particular
instant at which x(t) changes from negative to positive,
and let $t_1$ be the next instant at which x(t) changes from
positive to negative. Let $y(t_0) = y_0$ and $y(t_1) = y_1$;
$y_0 < y_1$, since x(t) is positive during this interval. The
probability density of the maximum amplitudes is


³ J. A. McFadden, “The probability density of the output of
a filter when the input is a random telegraphic signal—differential- 
equation method,” IRE TRANS. ON INFORMATION THEORY, vol.
IT-5, pp. 228-233; May, 1959.

⁴ W. M. Wonham, private communication. To be published in
J. Electronics and Control.

$Q_+(y_1)$, and the probability density of the minimum
amplitudes is $Q_-(y_0)$. By symmetry,

$$Q_+(y) = Q_-(-y). \tag{2}$$


From the solution of the differential equation (1), the
amplitudes $y_0$ and $y_1$ are related by

$$y_1 = 1 - (1 - y_0)\,e^{-\tau/T}, \tag{3}$$

where $\tau = t_1 - t_0$, which is the length of the appropriate
axis-crossing interval of the input. The probability
density of these intervals is $P_0(\tau)$, which is assumed to
be known.

Now, if successive axis-crossing intervals of the input
are statistically independent, $y_0$ and τ will be independent,
but $y_1$ and τ will be dependent. The joint probability
density of $y_0$ and τ is $Q_-(y_0) P_0(\tau)$. By (3), the variable
$y_0$ may be eliminated, and it follows that the joint density
of $y_1$ and τ is

$$e^{\tau/T}\, Q_-\!\left[1 - (1 - y_1)\,e^{\tau/T}\right] P_0(\tau). \tag{4}$$

Integration over the possible values of τ will yield the
marginal density $Q_+(y_1)$. The upper limit for this integration
is obtained from (3) when $y_0 = -1$. The result
for $Q_+(y_1)$ is the following:

$$Q_+(y_1) = \int_0^{T \log [2/(1 - y_1)]} e^{\tau/T} P_0(\tau)\, Q_-\!\left[1 - (1 - y_1)\,e^{\tau/T}\right] d\tau. \tag{5}$$

Thus, if $P_0(\tau)$ is known, the probability densities of the
maxima and minima can be obtained, in principle, from
the solution of this integral equation, under the restriction (2).

The next step is to calculate the probability density
of the output at an arbitrary instant in terms of the
densities of the maxima and minima. Let t be an arbitrary
instant between $t_0$ and $t_1$, i.e., during the period in which
the input is positive. Let the amplitude of the output at
that instant t be denoted simply by y. The probability
density of this amplitude is denoted by $P_+(y)$. If $l = t - t_0$,
then y is related to the minimum amplitude $y_0$ by the
relation

$$y = 1 - (1 - y_0)\,e^{-l/T}. \tag{6}$$

Then, by an argument similar to the previous one, the
probability density of y(t) is related to the density of
minimum amplitudes by the following integral:

$$P_+(y) = \int_0^{T \log [2/(1 - y)]} e^{l/T}\, Q_0(l)\, Q_-\!\left[1 - (1 - y)\,e^{l/T}\right] dl, \tag{7}$$

where $Q_0(l)$ is the probability density of the time difference
$t - t_0$, where t is fixed but $t_0$ is random.

It appears, therefore, that if $Q_-(y_0)$ can be determined
so as to satisfy (2) and (5), then the density $P_+(y)$ of
the amplitude at an arbitrary instant (while x is positive)
can be determined from (7), provided $Q_0(l)$ is known.
There remains the task of calculating $Q_0(l)$ in terms of
the density of axis-crossing intervals.

The probability that the first axis crossing before t
occurred between $t_0$ and $t_0 + dl$ is $Q_0(l)\,dl$, since $l = t - t_0$.
The same event may also be described as follows: the
probability that an axis crossing occurs between $t_0$ and
$t_0 + dl$ and that the next axis crossing occurs after time
t is⁵

$$\beta \, dl \int_l^{\infty} P_0(\tau) \, d\tau, \tag{8}$$

where β is the expected number of crossings per unit
time in the input. Then, by equating these two probabilities,

$$Q_0(l) = \beta \int_l^{\infty} P_0(\tau) \, d\tau. \tag{9}$$

The problem is completely formulated for the determination
of $P_+(y)$ in terms of a given function $P_0(\tau)$.
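As a brief worked illustration of (9), which is not part of the original derivation, take the exponential interval density of the random square wave of Poisson type, $P_0(\tau) = \beta e^{-\beta\tau}$, whose mean interval is 1/β. Then

$$Q_0(l) = \beta \int_l^{\infty} \beta e^{-\beta \tau} \, d\tau = \beta e^{-\beta l},$$

so that in this memoryless case the density of the elapsed time since the last crossing coincides with the interval density itself, and $Q_0(l)$ integrates to unity as a density should.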


By symmetry, the probability density of the amplitude
at an arbitrary time for which the input is negative is

$$P_-(y) = P_+(-y). \tag{10}$$

Finally, the probability density P(y) of the amplitude of
the output at any given instant, regardless of the sign
of the input, is the mean value

$$P(y) = \tfrac{1}{2}\left[P_+(y) + P_-(y)\right]. \tag{11}$$


REDUCTION BY LAPLACE TRANSFORMS


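The problem will now be reduced to a simpler one by
means of Laplace transforms. Let the amplitude variables
be transformed as follows:

$$y = 1 - 2e^{-\lambda/T}, \qquad \lambda = T \log \frac{2}{1 - y}. \tag{12}$$

Furthermore, let the probability density functions be
redefined as follows:

$$Q_+(y) = H_1(\lambda); \qquad Q_-(y) = H_2(\lambda); \qquad P_+(y) = G_1(\lambda). \tag{13}$$

Then the integrals in (5) and (7) become convolution
integrals and the Laplace transform may be employed.
Let

$$\mathcal{L}\{H_1(\lambda)\} = h_1(s); \quad \mathcal{L}\{H_2(\lambda)\} = h_2(s); \quad \mathcal{L}\{G_1(\lambda)\} = g_1(s); \quad \mathcal{L}\{P_0(\lambda)\} = p_0(s); \quad \mathcal{L}\{Q_0(\lambda)\} = q_0(s); \tag{14}$$

where

$$\mathcal{L}\{F(\lambda)\} = \int_0^{\infty} e^{-s\lambda} F(\lambda) \, d\lambda. \tag{15}$$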


⁵ J. A. McFadden, “The axis-crossing intervals of random
functions II,” IRE TRANS. ON INFORMATION THEORY, vol. IT-4, pp.
14-24, (76); March, 1958.

Under these transformations, (5) and (7) become the
following, respectively:

$$h_1(s) = p_0\!\left(s - \frac{1}{T}\right) h_2(s), \tag{16}$$

$$g_1(s) = q_0\!\left(s - \frac{1}{T}\right) h_2(s). \tag{17}$$

Furthermore, by an elementary calculation, the transformation
of (9) yields the result

$$q_0(s) = \frac{\beta}{s}\left[1 - p_0(s)\right]. \tag{18}$$

The constant β may also be calculated from $p_0(s)$. Since
1/β is the mean axis-crossing interval,⁶

$$\beta = -1/p_0'(0). \tag{19}$$

The expression for $q_0(s)$ may be substituted into (17);
then with the aid of (16) a simple relation follows for the
function $g_1(s)$ in terms of $h_1(s)$ and $h_2(s)$. The result is the
following, where κ = βT:

$$g_1(s) = \frac{\kappa}{sT - 1}\left[h_2(s) - h_1(s)\right]. \tag{20}$$

Thus, if $h_1(s)$ and $h_2(s)$ are known, $g_1(s)$ follows from
(20). Then the inverse transformation yields $P_+(y)$, and
by (10) and (11), the solution can be completed. The
principal problem remaining is that of determining
$h_1(s)$ and $h_2(s)$.

If $p_0(s)$ is known,⁷ one relation is provided by (16). The
other relation must be obtained from (2), the requirement
of symmetry. Under the change of variables (12)
and (13), (2) becomes the following:

$$H_1(\lambda) = H_2\!\left[-T \log\left(1 - e^{-\lambda/T}\right)\right] \tag{21}$$

or

$$H_2(\lambda) = H_1\!\left[-T \log\left(1 - e^{-\lambda/T}\right)\right]. \tag{22}$$

If these equations are subjected to the Laplace transformation,
the resulting integrals may be reduced by a
change of variable and by the binomial expansion. Then
it follows that

$$h_1(s) = \sum_{k=0}^{\infty} \frac{(1 - sT)_k}{k!}\, h_2\!\left(\frac{k + 1}{T}\right), \tag{23}$$

$$h_2(s) = \sum_{k=0}^{\infty} \frac{(1 - sT)_k}{k!}\, h_1\!\left(\frac{k + 1}{T}\right), \tag{24}$$

where $(a)_k = a(a + 1) \cdots (a + k - 1)$ and $(a)_0 = 1$.

It appears, therefore, that the determination of $h_1(s)$
and $h_2(s)$ requires a solution of (16) and either (23) or
(24).


⁶ Ibid., pp. 17-18.

⁷ $p_0(s)$ is closely related to the autocorrelation function and the
spectral density of the input. See (30) of McFadden.⁵ Also, J. L.
Lawson and G. E. Uhlenbeck, “Threshold Signals,” McGraw-Hill
Book Co., Inc., New York, N. Y., p. 45 (36b); 1950.




GENERAL METHOD OF SOLUTION 


A general method will now be given by which $h_1(s)$ and
$h_2(s)$ may be determined when $p_0(s)$ is specified. Inspection
of (23) and (24) reveals that $h_1(s)$ and $h_2(s)$ will be known
for all values of s if they are known for the discrete values
s = (k + 1)/T, where k is a non-negative integer. The
functions can be determined at these points by a recursive
method, since the series terminate when (1 − sT) is a
negative integer.

Suppose $h_1[(k + 1)/T]$ is known for the values k = 0,
1, 2, ..., n − 1. Then let s = (n + 1)/T in (24). The
last nonvanishing term in the series can be expressed in
terms of $h_2[(n + 1)/T]$ by means of (16). Then the
solution for $h_2[(n + 1)/T]$ is the following:

$$h_2\!\left(\frac{n + 1}{T}\right) = \frac{\displaystyle\sum_{k=0}^{n-1} \frac{(-n)_k}{k!}\, h_1\!\left(\frac{k + 1}{T}\right)}{1 - (-1)^n\, p_0(n/T)}. \tag{25}$$

Then, $h_1[(n + 1)/T]$ is obtained by (16) and the process
can be repeated for the next integral value of n. The
initial values (n = 0) are obtained from the normalization
of $Q_+$ and $Q_-$. For all functions $p_0(s)$,

$$h_1(1/T) = h_2(1/T) = T/2. \tag{26}$$

If a general expression can be obtained for $h_1[(n + 1)/T]$
and the series (24) summed in closed form, $h_1(s)$ can be
calculated from (16), and $g_1(s)$ follows from (20). The
chief difficulty arises in the determination of the nth
term of the series.
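The recursion can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the author's computation: the solved form h2((n+1)/T) = [sum over k < n of ((-n)_k / k!) h1((k+1)/T)] / [1 - (-1)^n p0(n/T)] is the one obtained by setting s = (n + 1)/T in (24) and eliminating the last term with (16), and the function names are hypothetical. The example transform p0 is that of the beta-type family treated under Particular Solutions, for which the values of h1 and h2 at s = (k + 1)/T are known in closed form.

```python
import math


def beta_fn(p, q):
    """Complete beta function B(p, q), evaluated through log-gamma."""
    return math.exp(math.lgamma(p) + math.lgamma(q) - math.lgamma(p + q))


def recursive_transforms(p0, T, n_max):
    """Return lists h1, h2 with h1[k] = h1((k+1)/T) and h2[k] = h2((k+1)/T)
    for k = 0..n_max, starting from the normalization h1(1/T) = h2(1/T) = T/2."""
    h1 = [T / 2.0]
    h2 = [T / 2.0]
    for n in range(1, n_max + 1):
        # (-n)_k / k! equals (-1)^k * C(n, k), so the series ends at k = n.
        partial = sum((-1) ** k * math.comb(n, k) * h1[k] for k in range(n))
        p = p0(n / T)
        h2_n = partial / (1.0 - (-1) ** n * p)
        h2.append(h2_n)
        h1.append(p * h2_n)  # relation (16) at s = (n + 1)/T
    return h1, h2


def beta_family_p0(a, b, T):
    """Interval-density transform of the beta-type family, for -1 < a < b."""
    def p0(s):
        return math.exp(math.lgamma(b + 1) + math.lgamma(s * T + a + 1)
                        - math.lgamma(a + 1) - math.lgamma(s * T + b + 1))
    return p0


h1, h2 = recursive_transforms(beta_family_p0(1.0, 2.0, 1.0), 1.0, 4)
print(h1)
print(h2)
```

Because the sum alternates in sign with binomial weights, rounding error grows with n; the sketch is intended for the first several moments only.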

This section will be concluded with the derivation of
an identity in $g_1(s)$. If (23) and (24) are substituted into
(20) and the series combined, the terms of the series can
be recognized as special cases of (20) for integral values
of sT. The final result is the following:

$$g_1(s) = \sum_{k=1}^{\infty} \frac{(2 - sT)_{k-1}}{(k - 1)!}\, g_1\!\left(\frac{k + 1}{T}\right). \tag{27}$$

This series can be considered as the Gregory-Newton
interpolation formula for a permissible solution of the
present problem.⁸

Note: The coefficients $h_1[(k + 1)/T]$ and $h_2[(k + 1)/T]$
are proportional to the kth moments of $(1 - y_1)$ and
$(1 - y_0)$, respectively. Furthermore, $g_1[(k + 1)/T]$ is
proportional to the kth conditional moment of (1 − y),
given that x is positive. (The constant of proportionality
is $T/2^{k+1}$.) Thus, the recursive method given above is
essentially a determination of the successive moments
of the output.

If, in a particular problem, a finite number of moments
of the output is desired (but beyond the mean square),
then the present method is recommended over the
standard one. The standard method requires a knowledge
of the higher product moments of the input, and these
are not generally known for binary processes.⁹ It is far
simpler to make assumptions about $P_0(\tau)$ than to work
with the higher product moments.

⁸ This observation is due to Prof. I. Marx. See E. T. Whittaker
and G. Robinson, “The Calculus of Observations,” Blackie and
Son, Ltd., London and Glasgow, p. 10; 1924.


PARTICULAR SOLUTIONS 


The simplest method for discovering new solutions of
the given problem is a semi-inverse one, rather than the
direct recursive method given above. If $Q_+(y)$ and $Q_-(y)$
are chosen so as to obey (2), and properly normalized
over the range (−1, 1), the other functions will all follow
directly if the transforms and inverse transforms can be
calculated.

Suppose that the probability densities of maxima and
minima are those belonging to beta distributions, shifted
to the proper range:¹⁰

$$Q_+(y) = \frac{(1 - y)^a (1 + y)^b}{2^{a+b+1}\, B(a + 1, b + 1)}, \qquad Q_-(y) = \frac{(1 + y)^a (1 - y)^b}{2^{a+b+1}\, B(a + 1, b + 1)}, \tag{28}$$

where each density is zero outside the prescribed range,
and −1 < a < b.

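After the change of variables (12) and (13), the corresponding
functions $H_1$ and $H_2$ are the following:

$$H_1(\lambda) = \frac{e^{-a\lambda/T} \left(1 - e^{-\lambda/T}\right)^b}{2\, B(a + 1, b + 1)}, \qquad H_2(\lambda) = \frac{e^{-b\lambda/T} \left(1 - e^{-\lambda/T}\right)^a}{2\, B(a + 1, b + 1)}. \tag{29}$$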


The Laplace transforms of these functions may be
obtained by a straightforward integration as in (15):

$$h_1(s) = \frac{T\, B(sT + a, b + 1)}{2\, B(a + 1, b + 1)}, \qquad h_2(s) = \frac{T\, B(sT + b, a + 1)}{2\, B(a + 1, b + 1)}. \tag{30}$$


These transforms may be substituted into (16), which
may then be solved for the transform of the axis-crossing
interval density:

$$p_0(s) = \frac{\Gamma(b + 1)\, \Gamma(sT + a + 1)}{\Gamma(a + 1)\, \Gamma(sT + b + 1)}. \tag{31}$$

The interval density can then be obtained with the aid
of tables of Laplace transforms:¹¹

$$P_0(\tau) = \frac{1}{T\, B(a + 1, b - a)}\, e^{-(a+1)\tau/T} \left(1 - e^{-\tau/T}\right)^{b-a-1}. \tag{32}$$


The next step is the determination of the probability
density of the output. From (20), $G_1(\lambda)$ can be written
as the convolution of the inverse transforms of the two
factors of the right member. One of these inverse transforms
is elementary, and the other is obtained from the
difference of the two functions (29). Then, by the change
of variables (12) and (13), the following result appears
for the probability density of the output at an arbitrary
instant while the input is positive:

$$P_+(y) = \frac{\kappa}{(1 - y)\, 2^{a+b+1}\, B(a + 1, b + 1)} \int_{-1}^{y} \left[(1 + \xi)^a (1 - \xi)^b - (1 - \xi)^a (1 + \xi)^b\right] d\xi, \quad -1 < y < 1; \tag{33}$$

$P_+(y) = 0$ elsewhere.

⁹ J. A. McFadden, “The fourth product moment of infinitely
clipped noise,” IRE TRANS. ON INFORMATION THEORY, vol. IT-4,
pp. 159-162; December, 1958.

¹⁰ H. Cramér, “Mathematical Methods of Statistics,” Princeton
University Press, Princeton, N. J., p. 243; 1946.

¹¹ Bateman Manuscript Project, “Tables of Integral Transforms,”
McGraw-Hill Book Co., Inc., New York, N. Y., vol. 1, p. 261, (1),
p. 129, (5); 1954.


Finally, the probability density of the output at an
arbitrary instant, regardless of the sign of the input, may
be obtained from (10) and (11). By a simple manipulation,
the result may be expressed in terms of incomplete beta
functions:

$$P(y) = \frac{\kappa}{1 - y^2} \left[I_{(1+y)/2}(a + 1, b + 1) - I_{(1+y)/2}(b + 1, a + 1)\right], \quad -1 < y < 1; \tag{34}$$

P(y) = 0 elsewhere; where

$$I_x(p, q) = \frac{1}{B(p, q)} \int_0^x \xi^{p-1} (1 - \xi)^{q-1} \, d\xi. \tag{35}$$

$I_x(p, q)$ is a tabulated function.¹²

The constant κ (= βT) is related to a and b as follows:
β can be obtained from (31) by means of (19). The result is
in terms of the logarithmic derivative of the gamma
function:

$$\frac{1}{\kappa} = \psi(b + 1) - \psi(a + 1), \tag{36}$$

where $\psi(z) = \Gamma'(z)/\Gamma(z)$.

The above solution includes two previously known
cases. If b = a + 1, then κ = b and (32) becomes a simple
exponential, which is characteristic of the axis-crossing
intervals of a random square wave of Poisson type. The
density P(y) in (34) becomes the particular solution of
Wonham and Fuller¹ for this input.
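The Poisson special case offers a convenient numerical check, sketched below in Python. The sketch rests on several assumptions: the output density is taken in the form P(y) = κ/(1 − y²) · [I_x(a+1, b+1) − I_x(b+1, a+1)] with x = (1 + y)/2, which follows from (10), (11) and the beta-type densities of the maxima and minima; a and b are restricted to integers with 0 ≤ a < b, so that the digamma difference in (36) reduces to a finite harmonic sum; and the incomplete beta function is obtained by Simpson's rule instead of from tables. All function names are hypothetical.

```python
import math


def incomplete_beta_ratio(x, p, q, steps=2000):
    """I_x(p, q) by composite Simpson's rule (intended for p, q >= 1)."""
    if x <= 0.0:
        return 0.0
    if x >= 1.0:
        return 1.0

    def f(t):
        return t ** (p - 1) * (1.0 - t) ** (q - 1)

    h = x / steps  # steps must be even for Simpson's rule
    total = f(0.0) + f(x)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * f(i * h)
    beta = math.exp(math.lgamma(p) + math.lgamma(q) - math.lgamma(p + q))
    return total * h / 3.0 / beta


def kappa(a, b):
    """kappa from (36) for integers 0 <= a < b, where
    psi(b + 1) - psi(a + 1) = sum of 1/k for k = a + 1, ..., b."""
    return 1.0 / sum(1.0 / k for k in range(a + 1, b + 1))


def output_density(y, a, b):
    """Assumed form of P(y) for the beta-type family; zero outside (-1, 1)."""
    if not -1.0 < y < 1.0:
        return 0.0
    x = (1.0 + y) / 2.0
    diff = (incomplete_beta_ratio(x, a + 1, b + 1)
            - incomplete_beta_ratio(x, b + 1, a + 1))
    return kappa(a, b) * diff / (1.0 - y * y)


# Poisson case b = a + 1: kappa equals b, and the density should be
# proportional to (1 - y^2)^(kappa - 1).
print(kappa(1, 2), output_density(0.0, 1, 2), output_density(0.5, 1, 2))
```

With a = 1 and b = 2 the density should equal (3/4)(1 − y²), the normalized form of (1 − y²)^{κ−1} for κ = 2, which is the shape known for the random square wave of Poisson type.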

If a and b both become infinite, but in such a way that
b/a remains finite, then $P_0(\tau)$ becomes a δ-function:

$$P_0(\tau) = \delta\!\left(\tau - T \log \frac{b}{a}\right). \tag{37}$$

Thus, the input becomes a periodic square wave with
random time origin. If the input is to be stationary, the
time origin must be uniformly distributed over a complete
period.

The probability density of the output of an RC low-pass
filter, when the input is such a periodic square wave, has
been calculated by McFadden.¹⁴ The solution (34)
becomes identical to the previous one in the limit, where
1/κ ~ log (b/a).

¹² K. Pearson, “Tables of the Incomplete Beta Function,”
Cambridge University Press, Cambridge, England; 1934.

¹³ Bateman Manuscript Project, “Higher Transcendental Functions,”
McGraw-Hill Book Co., Inc., New York, N. Y., vol. 1,
p. 15; 1953.

For other values of a and b, P(y) may be calculated
from the tables.¹² For a = 5 and b = 40, the probability
density of axis-crossing intervals of the input is shown
in Fig. 1. This curve is suggestive of an infinitely clipped,
narrow-band Gaussian signal. (In reality, however, such
a signal would display a strong dependence between
successive axis-crossing intervals,¹⁵ and this dependence
has been neglected here.) For this case, κ = βT = 0.50125.

For the same case, the solution for P(y) is shown in
Fig. 2. This shape approaches the limiting form of the
periodic case,¹⁴ for which P(y) is concave upward between
upper and lower bounds (these bounds lying inside of
y = ±1) and zero elsewhere.

Fig. 1. Probability density of axis-crossing intervals of the input
as given by (32), when a = 5 and b = 40.

Fig. 2. Probability density of the output of an RC low-pass filter
when the input is a binary random process, the axis-crossing
intervals of which are statistically independent and which obey
the probability density law shown in Fig. 1.

¹⁴ McFadden, op. cit., footnote 2, (20).
¹⁵ McFadden, op. cit., footnote 5, p. 22.

Another particular solution, not included under (34),
can be determined from a different assumption on $Q_+$
and $Q_-$. Suppose

$$Q_+(y) = \begin{cases} 1, & 0 \le y \le 1; \\ 0 & \text{elsewhere}; \end{cases} \qquad Q_-(y) = \begin{cases} 1, & -1 \le y \le 0; \\ 0 & \text{elsewhere}. \end{cases} \tag{38}$$

Then, by the same procedure as in the previous example,

$$H_1(\lambda) = \begin{cases} 1, & T \log 2 \le \lambda < \infty; \\ 0 & \text{elsewhere}; \end{cases} \qquad H_2(\lambda) = \begin{cases} 1, & 0 < \lambda < T \log 2; \\ 0 & \text{elsewhere}. \end{cases} \tag{39}$$

$$h_1(s) = 2^{-sT}/s; \qquad h_2(s) = (1 - 2^{-sT})/s; \tag{40}$$

$$p_0(s) = 1/(2^{1+sT} - 1); \tag{41}$$

$$P_0(\tau) = \sum_{k=1}^{\infty} 2^{-k}\, \delta(\tau - kT \log 2). \tag{42}$$

This is the axis-crossing interval density for a symmetric
coin-toss square wave with a random time origin, and the
elementary pulse width equal to (T log 2).¹⁶ The densities
$P_+(y)$ and P(y) may also be obtained as before:

$$P_+(y) = \begin{cases} \kappa\, \dfrac{1 + y}{1 - y}, & -1 \le y \le 0; \\ \kappa, & 0 \le y \le 1; \end{cases} \tag{43}$$

$$P(y) = \begin{cases} \dfrac{\kappa}{1 + |y|}, & -1 \le y \le 1; \\ 0 & \text{elsewhere}; \end{cases} \tag{44}$$

where $\kappa = (2 \log 2)^{-1}$. This solution was discovered
previously by Wonham, who used a different method.¹⁷
No solution has yet been obtained in closed form for the
coin-toss input with an arbitrary ratio of time constants.

RC HIGH-PASS FILTER


Any solution for the RC low-pass filter may be adapted
to the case of an RC high-pass filter, as shown by
McFadden.¹⁸


CONCLUSION 


This paper has described a new method for studying 
the output of an RC filter when the input is a stationary
binary random process. Much work remains to be done 
concerning the properties of the solution, particularly 
the asymptotic approach to a Gaussian density. It is 
also conceivable that the method may be extended to 
other types of linear systems. 


¹⁶ J. A. McFadden, “The axis-crossing intervals of random
functions,” IRE TRANS. ON INFORMATION THEORY, vol. IT-2, pp.
146-150 (5); December, 1956.

¹⁷ W. M. Wonham, private communication.

¹⁸ McFadden, op. cit., footnote 2, (26).


Recurrent Events in a Bernoulli Sequence*


M. B. MARCUS†


Summary—The ‘‘point of regeneration’? method is used to 
obtain simple sequential equations for determining the complete
probability density function for multiple occurrences of events in 
a Bernoulli sequence. Both the independent and overlapping classes 
of recurrent events are included in the general framework of these 
equations. The equations also lead to the generating function for 
the probability distribution. This is used to obtain the expected 
recurrence times for the different classes of recurrent events. 

A distinction is made between the probability distributions for 
the occurrence of the kth event at the nth trial and the occurrence 
of k events in n trials. The latter case is of primary concern. 

The methods employed and the results obtained have extensive
applications in problems in automatic control, communications, 
and information processing. 


INTRODUCTION 


THE techniques of functional equations, in particular
the “point of regeneration” method,¹,²
will be used to formulate equations for computing
the probability density function of recurrent events in a
Bernoulli process. An event is defined as any pre-established
and distinguishable subsequence of the
process. Recurrent events are events that repeat as the 
process continues. They are separated into two classes, in- 
dependent and overlapping. We will find the probability 
that an event will occur k times in n trials of the process 
for all k and n and for each of these classes. 

Feller³ presents the generating functions for the prob-
ability density of the first occurrence of an event at the 
nth trial. Products of these generating functions, say k 
of them, give the generating function of the kth occurrence 
of the event at the nth trial. The kth occurrence of an 
event at the nth trial and k occurrences of the event in n 
trials are different. The latter case, which was suggested 
by counting successes in an experiment of length n, is 
the concern of this paper. 

The generating function for the probability of k 
occurrences of an event in n trials will be found for both
independent- and overlapping-type events. It will be
used to derive generating functions for the occurrence of
an independent event, as given by Feller, and for the
occurrence of an overlapping event, a result previously
unknown to the author. These two functions give the
expected value for the occurrence of each type of event. 
A comparison of the expected values shows the relative 


* Manuscript received by the PGIT, October 14, 1959.

† Engrg. Div., The RAND Corp., Santa Monica, Calif.

¹ T. E. Harris, “Some mathematical models for branching
processes,” Proc. Second Berkeley Symp. on Math. Statistics and
Probability, pp. 305-328; 1951.

² R. Bellman and T. E. Harris, “On age dependent binary
branching processes,” Ann. Math., vol. 55, pp. 280-295; March,
1952.

³ W. Feller, “Probability Theory and its Applications,” John
Wiley and Sons, Inc., New York, N. Y., vol. 1; 1950.


rate at which these events accumulate in a Bernoulli 
process. 

The significance of this paper is that it describes a 
simple and direct method for computing probability 
density functions for many important problems in 
communications. The method treats both independent 
and overlapping recurrent events and is not complicated 
by increasing the number of elements contained in an 
event. 

The analysis considers both beginning the observations 
of a sequence from its start and after it has been in 
operation. A discussion of the physical importance of this 
analysis will clarify the meaning of all of these distinctions. 
The methods will be described first in the solution of two 
particular problems and then in general. 


RECURRENT EVENTS OF A BERNOULLI SEQUENCE 


To illustrate recurrent events, consider a Bernoulli 
sequence of 1’s and 0’s. Each trial of the sequence has a 
probability p of turning up 1 and a probability q = 1 − p
of being 0. The trials are independent.

An event is any collection of 1’s and 0’s, such that at 
any trial we can tell whether or not the event has occurred. 
An example of an event might be “three consecutive 
1’s”; the trial at which the third consecutive 1 is obtained 
is the trial at which the event is said to occur. As the 
trials continue the event will, in general, occur again. 
The purpose of this paper is to derive a method of com- 
puting the probability that in n trials the event will 
recur k times, for all k and n. 

Recurrent events can be viewed in two ways, as in- 
dependent events or as overlapping events. If a recurrent 
event is independent, then after each occurrence of the 
event, the process is considered to begin newly again. 

For overlapping events, each trial is viewed to see if the 
event has occurred, regardless of whether or not it 
occurred on the previous trials. If an event is defined as 
“three consecutive 1’s,” then four consecutive 1’s represent
one event and one 1 if the events are viewed independently, 
but two events if they are viewed as overlapping events. 
Both of these cases have important physical analogs. 

In deriving the probabilities for k occurrences in n 
trials, we can either assume that the process begins at 
n = 1 or that the process is already underway and that 
at n = 1, a set of conditional probabilities already exists 
for the occurrence of the event. 


PHYSICAL EXAMPLES 


In each case of recurrent events, independent or over- 
lapping, there are important physical examples. As an 
example of independent recurrent events, we might
consider memory erasing processes. These could be a
decision followed by the clearance of a shift register or a 
servomechanism that changes into an unfavorable mode 
of operation when it receives a certain signal and which 
must be turned off and begun again fresh. 

Overlapping recurrent events can characterize a system 
that needs a certain signal to begin operating and which 
will continue to operate as long as the signal is present. 
An obvious example is meteor burst communications.⁴,⁵
Another example, one which led to this paper, was the
analysis of a radar scheme⁶ in which each input pulse was
delayed and compared to the following pulse and was 
recorded only if they both exceeded a threshold. This is a 
case of overlapping recurrent events of two signals that 
exceed the threshold. 


EVENT OF TWO CONSECUTIVE OCCURRENCES


We will find the probability density function for an 
event consisting of two consecutive 1’s for both the 
independent and the overlapping case. For the sake of 
simplicity, first consider that the process begins at n = 1. 

We will define the following functions: 


$R(n, k)$ = the conditional probability that in the next
n trials the event will occur k times, given
that the process begins at trial n = 1.

$P_0(n, k)$ = the conditional probability that in the next
n trials the event will occur k times, given
that the previous trial resulted in a 0.

$P_1(n, k)$ = the conditional probability that in the next
n trials the event will occur k times, given
that the previous trial resulted in a 1.


Consider the function $R(n, k)$, the probability of k
occurrences in n trials, starting from the first trial. If the
outcome of the first trial is a 1, $R(n, k)$ will be equal to
$P_1(n - 1, k)$, the conditional probability of k occurrences
in (n - 1) trials, given that the previous pulse was a 1.
If, however, the outcome of the first trial is a 0, $R(n, k)$
will be equal to $P_0(n - 1, k)$. Recall that a 1 occurs with
probability p and a 0 with probability q = 1 - p. Therefore

$$R(n, k) = p\,P_1(n - 1, k) + q\,P_0(n - 1, k). \tag{1}$$


If the process is in the conditional state corresponding
to the probability $P_0(n, k)$, then according to the outcome
of the next trial, it will go to either the state corresponding
to $P_1(n - 1, k)$ or the state corresponding to $P_0(n - 1, k)$,
i.e.,

$$P_0(n, k) = p\,P_1(n - 1, k) + q\,P_0(n - 1, k). \tag{2}$$


4 P. A. Forsyth, E. L. Vogan, D. R. Hansen, and C. O. Hines,
"The principles of JANET: A meteor-burst communications
system," Proc. IRE, vol. 45, pp. 1642-1657; December, 1957.

5 L. L. Campbell, "Storage capacity in burst-type communication
systems," Proc. IRE, vol. 45, pp. 1661-1666; December, 1957.

6 M. B. Marcus, "Analysis of a Method to Permit the Operation
of a Pulse Search Detection Radar in an Environment of Interfering
Radar Units," The RAND Corp., Santa Monica, Calif., Res. Memo.
RM-2375; May, 1959.

Therefore

$$R(n, k) = P_0(n, k), \tag{3}$$

and since we have not yet distinguished between independent
and overlapping events, (3) applies to both
of these cases.

Now consider that the process is in the conditional
state that corresponds to the probability $P_1(n, k)$ and
that independent events are being recorded. If the next
trial results in a 1, then an event occurs. Since independent
events are being recorded, the process must begin again
at the next stage; i.e., the process goes into the conditional
state corresponding to the probability $R(n - 1, k - 1)$.
By (3), this is $P_0(n - 1, k - 1)$. If, however, the
next trial results in a 0, the event does not take place and
the process goes into the state corresponding to
$P_0(n - 1, k)$. Thus, in the case of independent recurrent
events,

$$P_1(n, k) = p\,P_0(n - 1, k - 1) + q\,P_0(n - 1, k). \tag{4}$$


For the case of overlapping events, if the process is in
$P_1(n, k)$ and the next trial results in a 1, then an event
occurs and the process remains in the conditional state
where the previous pulse is a 1 [i.e., if the trial results
in a 1, $P_1(n, k)$ goes into $P_1(n - 1, k - 1)$]. Therefore,
in the case of overlapping recurrent events,

$$P_1(n, k) = p\,P_1(n - 1, k - 1) + q\,P_0(n - 1, k). \tag{5}$$

The results of each of these cases will be summarized
below.


Independent Recurrent Events of Two Consecutive 1's


Recall that $R(n, k)$ is the probability that in n trials
there will occur k events, and that $R(n, k) = P_0(n, k)$.
In the case of independent recurrent events, we obtain
the complete distribution of $R(n, k)$ from (2) and (4), i.e.,

$$\begin{aligned}
P_0(n, k) &= p\,P_1(n - 1, k) + q\,P_0(n - 1, k) \\
P_1(n, k) &= p\,P_0(n - 1, k - 1) + q\,P_0(n - 1, k).
\end{aligned} \tag{6}$$


This can be expressed as a single equation for $R(n, k)$,

$$R(n, k) = P_0(n, k) = q\{P_0(n - 1, k) + p\,P_0(n - 2, k)\} + p^2 P_0(n - 2, k - 1). \tag{7}$$

The initial values of $P_0(n, k)$ are easily obtained;

$$P_0(2, 0) = 1 - p^2, \qquad P_0(2, 1) = p^2. \tag{8}$$

Furthermore, $P_0(i, j) = 0$ for $i < j$, and $P_0(0, 0)$ is defined
equal to 1. The process is undefined for $i < 0$. These
values and (7) enable $R(n, k)$ to be solved recursively for
all n and k.


Overlapping Recurrent Events of Two Consecutive 1's


Similar to the previous case, $R(n, k) = P_0(n, k)$. The
equations that govern the case of overlapping events
are (2) and (5), i.e.,

$$\begin{aligned}
P_0(n, k) &= p\,P_1(n - 1, k) + q\,P_0(n - 1, k) \\
P_1(n, k) &= p\,P_1(n - 1, k - 1) + q\,P_0(n - 1, k).
\end{aligned} \tag{9}$$

As an equation for $R(n, k)$ in terms of $P_0(n, k)$ alone,
this is

$$R(n, k) = P_0(n, k) = q\{P_0(n - 1, k) + p\,P_0(n - 2, k)\} + p\{P_0(n - 1, k - 1) - q\,P_0(n - 2, k - 1)\}. \tag{10}$$

The initial conditions are

$$\begin{aligned}
P_0(0, 0) &= 1, & P_0(1, 0) &= 1, & P_0(2, 0) &= 1 - p^2, \\
& & P_0(1, 1) &= 0, & P_0(2, 1) &= p^2,
\end{aligned} \tag{11}$$

and $P_0(i, j) = 0$ when $i < j$. The process is undefined
for $i < 0$.

Therefore, in this case of overlapping events, $R(n, k)$
can still be calculated from a single recursive equation.
In the next section, the problem of recurrent events
will be expressed in generality including the two cases
mentioned here.

GENERALIZATIONS OF RECURRENT EVENTS 


Consider an event that consists of j elements, in which
each element can either be a 1 or a 0. The sequence can
be in any of the states r, $0 \le r < j$, where the state r
designates that the last r members of the sequence comprise
the first r elements of the event. In this manner, the
state 0 designates that the last member of the sequence
contains no elements of the event.

Associated with each state r is a probability $P_r(n, k)$;
$P_r(n, k)$ is the conditional probability that the sequence
will contain k occurrences of the event in the next n
trials, given that it is currently in state r.

In order for this problem to be deterministic, it must be
stipulated that the sequence can be in one and only one
state r at a given time and that given the history of the
sequence, it is always possible to tell which state this is.
Since the event is clearly established, the transition probabilities
(p or q) of going from state r to state r + 1 are
known. It is also assumed that if the process does not go
on to state r + 1 it returns to state 0, unless of course the
state r + 1 is the terminal state of the event. This case
will be explained later. The stipulation that the process
goes from state r to state 0 if it does not proceed in the
pattern of the event is not necessary mathematically,
but it is compatible with physical applications and
generally simplifies the theory.

We assume that at the occurrence of an event of j
elements, the process goes to the ith state. For independent
events, i = 0. It is seen that the solution for the process
is equally simple for any $0 \le i \le j - 1$. Notice that this
extends the concept of overlapping events.

In order to simplify the expression of transition probabilities,
we will consider that the event is j consecutive
1's. Since the 1's and 0's are independent, no loss of
generality is imposed. We define:

$P_r(n, k)$ = the conditional probability that in the next
n trials the event will occur k times, given
that the previous r trials resulted in 1's.

When the event occurs, the process goes into state i.
If a 0 occurs, the process returns to state 0. The equations
describing this process are

$$\begin{aligned}
P_0(n, k) &= p\,P_1(n - 1, k) + q\,P_0(n - 1, k) \\
P_1(n, k) &= p\,P_2(n - 1, k) + q\,P_0(n - 1, k) \\
&\;\;\vdots \\
P_{j-2}(n, k) &= p\,P_{j-1}(n - 1, k) + q\,P_0(n - 1, k) \\
P_{j-1}(n, k) &= p\,P_i(n - 1, k - 1) + q\,P_0(n - 1, k).
\end{aligned} \tag{12}$$

These can be reduced to one equation in $P_0(n, k)$, i.e.,

$$P_0(n, k) = q \sum_{r=1}^{j} p^{r-1} P_0(n - r, k) + p^{j-i}\Big\{P_0(n - (j - i), k - 1) - q \sum_{r=1}^{i} p^{r-1} P_0(n - (j - i) - r, k - 1)\Big\}. \tag{13}$$

We have already established that $R(n, k) = P_0(n, k)$.
The number of terms in the equation for $P_0(n, k)$ does not
influence computing time. Thus, even in this general case,
the complete probability density function can be obtained
recursively from a single equation in $P_0$.
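As a hedged illustration of the general case, the sketch below evaluates the system of state equations for an event of j consecutive 1's that returns the process to state i, and compares the result with a direct enumeration of all binary sequences. The function names and the boundary convention (zero events with probability 1 when no trials remain) are assumptions of this sketch. The enumeration uses the same state bookkeeping, so it checks the recursion rather than the model.

```python
from fractions import Fraction
from functools import lru_cache
from itertools import product


def distribution(n, j, i, p):
    """P_0(n, k) for k = 0..n; event = j consecutive 1's, restart in state i."""
    q = 1 - p

    @lru_cache(maxsize=None)
    def P(r, m, k):
        if k < 0:
            return Fraction(0)
        if m == 0:  # assumed boundary condition
            return Fraction(int(k == 0))
        after_zero = q * P(0, m - 1, k)  # a 0 always returns to state 0
        if r < j - 1:
            return p * P(r + 1, m - 1, k) + after_zero
        # the j-th consecutive 1 completes an event and moves to state i
        return p * P(i, m - 1, k - 1) + after_zero

    return [P(0, n, k) for k in range(n + 1)]


def brute_force(n, j, i, p):
    """Same distribution by enumerating all 2**n sequences (small n only)."""
    q = 1 - p
    dist = [Fraction(0)] * (n + 1)
    for seq in product((0, 1), repeat=n):
        state = count = 0
        for bit in seq:
            if bit == 0:
                state = 0
            elif state == j - 1:
                count, state = count + 1, i
            else:
                state += 1
        ones = sum(seq)
        dist[count] += p ** ones * q ** (n - ones)
    return dist


if __name__ == "__main__":
    p = Fraction(2, 5)
    print(distribution(6, 3, 2, p) == brute_force(6, 3, 2, p))
```

Memoizing on (state, trials remaining, count) keeps the work proportional to j·n·k, which is the practical content of the remark that the recursion is cheap to compute.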


GENERATING FUNCTIONS FOR $P_0(n, k)$


A generating function for $P_0(n, k)$ will be found; an
event is j consecutive ones and the process returns to the
state i after the event occurs. The generating function
is defined as

$$U(s, t) = \sum_{k=1}^{\infty} \sum_{n=j}^{\infty} P_0(n, k)\, t^n s^k. \tag{14}$$

The counting for n begins at j because n = j is the first
trial at which the event can occur.

The equation for $U(s, t)$ is obtained from (13), with
minor skills but much labor.

$$U(s, t) = s \sum_{n=j}^{\infty} P_0(n, 1)\, t^n \cdot \frac{1}{1 - s\,\dfrac{p^{j-i} t^{j-i}(1 - t) + q p^{j} t^{j+1}}{1 - t + q p^{j} t^{j+1}}}. \tag{15}$$

This can be rewritten

$$U(s, t) = \sum_{n=j}^{\infty} P_0(n, 1)\, t^n \sum_{k=1}^{\infty} s^{k} \left[\frac{p^{j-i} t^{j-i}(1 - t) + q p^{j} t^{j+1}}{1 - t + q p^{j} t^{j+1}}\right]^{k-1}. \tag{16}$$

The generating function, as it is given in (16), is used
to obtain important properties of recurrent events and to
aid in calculating the values of $P_0(n, k)$. We begin by
considering the terms $P_0(n, k)$ in detail.

$P_0(n, 1)$ is the probability that the event occurs once
in the first n trials. The occurrence of one event in the
first n trials can happen in any of the following ways: the
event occurs for the first time at the rth trial and does
not occur again in the next n - r trials, the event occurs
first at the (r + 1)st trial and does not occur again in the
next n - (r + 1) trials, and so on for all r from 1 to n.
Thus, the probability of one occurrence of the event in
the first n trials is the convolution of the probability of
the first occurrence of the event at the rth trial and the
probability of no occurrences of the event in n - r trials.

We must account for independent and overlapping
events. At the start of a process, there is no history;
overlapping is impossible for the first event. After the
occurrence of the first event, the process returns to the state i.
If i = 0, the events are independent. Notice that for the
first occurrence of the event, both overlapping and
independent events behave as independent events.

$P_0(n, 1)$ is equal to the convolution of the probability
of the first occurrence of an independent event at the
rth trial and the conditional probability that no event
will occur in the next n - r trials, given that an event has
just occurred and the process is in state i.

Let $F_0(n)$ be the probability of the first occurrence of
an independent event at the nth trial and let $W_0(t)$ be
its generating function for all n.

Let $P_i(n, 0)$ be the conditional probability that no
event will occur in n trials, given that an event has just
occurred and the process returned to state i. Let $V_i(t)$
be the generating function for $\{P_i(n, 0)\}$.

Then

$$\sum_{n} P_0(n, 1)\, t^n = W_0(t)\, V_i(t). \tag{17}$$

$P_0(n, 2)$ is the probability of two occurrences of the
event in the first n trials. The event will occur twice in
the first n trials if the event occurs for the first time at
the rth trial, the second time at the (r + l)th trial and does
not occur from the (r + l + 1)st trial to the nth trial
($0 < r \le n - 1$, $r + l \le n$). Therefore, $P_0(n, 2)$ is the
convolution of the probability of the first occurrence of
an independent event at the rth trial, the conditional
probability that the event will occur next at the lth trial,
given that the event has just occurred and the process
returned to state i, and the conditional probability that
no event occurs in n - (r + l) trials, given that an event
has just occurred and the process returned to state i.

Let $F_i(n)$ denote the conditional probability that the
event will occur next at the nth stage, given that the
event has just occurred and the process returned to state i.
Let $W_i(t)$ be the generating function for $\{F_i(n)\}$. Notice
that for independent events, i = 0, both $F_i(n)$ and $W_i(t)$
are identical to the expressions for the first occurrence of
an independent event, as we would expect.

It follows that

$$\sum_{n} P_0(n, 2)\, t^n = W_0(t)\, W_i(t)\, V_i(t), \tag{18}$$

and, substituting (17),

$$\sum_{n} P_0(n, 2)\, t^n = \sum_{n} P_0(n, 1)\, t^n\, W_i(t). \tag{19}$$

Furthermore,

$$\sum_{n} P_0(n, k)\, t^n = \sum_{n} P_0(n, 1)\, t^n\, [W_i(t)]^{k-1}. \tag{20}$$

A comparison of (16) and (20) shows that

$$W_i(t) = \frac{p^{j-i} t^{j-i}(1 - t) + q p^{j} t^{j+1}}{1 - t + q p^{j} t^{j+1}}. \tag{21}$$

It follows directly, with i = 0, that

$$W_0(t) = \frac{p^{j} t^{j}(1 - pt)}{1 - t + q p^{j} t^{j+1}}. \tag{22}$$

Finally, because a return to state j - 1 is of particular
interest, we display

$$W_{j-1}(t) = \frac{pt(1 - t) + q p^{j} t^{j+1}}{1 - t + q p^{j} t^{j+1}}. \tag{23}$$

The function $V_i(t)$ is the generating function for the
probabilities that the first j - i elements of the process
are not all 1's and that no run of j consecutive 1's occurs
by the nth trial of the process. It is obtained from
recursion relations similar to (13).

$$V_i(t) = \frac{1 - p^{j-i} t^{j-i}}{1 - t + q p^{j} t^{j+1}}. \tag{24}$$

Thus, we can obtain the complete generating function
for $U(s, t)$, where the process returns to state i after an
event has occurred, $0 \le i \le j - 1$;

$$U(s, t) = \frac{p^{j} t^{j}(1 - pt)(1 - p^{j-i} t^{j-i})}{(1 - t + q p^{j} t^{j+1})^{2}} \sum_{k=1}^{\infty} s^{k} \left[\frac{p^{j-i} t^{j-i}(1 - t) + q p^{j} t^{j+1}}{1 - t + q p^{j} t^{j+1}}\right]^{k-1}. \tag{25}$$


EXPECTED NUMBER OF TRIALS REQUIRED FOR THE
OCCURRENCE OF AN EVENT


Expected values are meaningless when considering the
occurrence of k events in n trials. However, we can find
the expected number of trials required for the kth event
to occur.

The generating function for the kth occurrence of an
event at the nth trial is $W_0(t)[W_i(t)]^{k-1}$; call this $Z(t)$. Then

$$\frac{d}{dt} Z(t)\Big|_{t=1} = \text{expected number of trials required for k events to take place, when the process returns to state i after the occurrence of each event.} \tag{26}$$

Differentiating $Z(t)$ with respect to t and substituting
t = 1, we obtain

$$\frac{d}{dt} Z(t)\Big|_{t=1} = \frac{1 - p^{j}}{q p^{j}} + (k - 1)\,\frac{1 - p^{j-i}}{q p^{j}}. \tag{27}$$

Finally, we will examine the occurrence of events in the
two extremes of the process, that is, when the events are
independent (i = 0), and when there is complete overlapping
(i = j - 1). The expected number of trials
required for k occurrences are, respectively,

$$\frac{k(1 - p^{j})}{q p^{j}} \tag{28}$$

and

$$\frac{1 - p^{j}}{q p^{j}} + \frac{k - 1}{p^{j}}. \tag{29}$$

If we consider that j (the number of elements in an
event) is large enough to make $p^{j}$ negligible with respect
to 1, and that k itself is large, then these two equations
are approximated by

$$\frac{1}{q}\,\frac{k}{p^{j}} \tag{30}$$

and

$$\frac{k}{p^{j}}. \tag{31}$$

Thus, in the extreme cases, overlapping events occur,
on the average, 1/q times more frequently than independent
events.
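The mean waiting times can be cross-checked without generating functions. The sketch below is an illustration under stated assumptions, not the paper's derivation: it solves the first-step equations $E_r = 1 + pE_{r+1} + qE_0$, $E_j = 0$ for the mean wait $E_r$ from state r, and compares $E_0 + (k - 1)E_i$ with the closed form $(1 - p^j)/(qp^j) + (k - 1)(1 - p^{j-i})/(qp^j)$ that these equations imply. The function names are hypothetical.

```python
from fractions import Fraction


def expected_trials(k, j, i, p):
    """Mean number of trials until the k-th event (j ones, restart in state i)."""
    q = 1 - p
    # Write E_r = alpha[r] + beta[r] * E_0 and fill in backward from E_j = 0.
    alpha = [Fraction(0)] * (j + 1)
    beta = [Fraction(0)] * (j + 1)
    for r in range(j - 1, -1, -1):
        alpha[r] = 1 + p * alpha[r + 1]
        beta[r] = q + p * beta[r + 1]
    e0 = alpha[0] / (1 - beta[0])      # wait for the first event
    ei = alpha[i] + beta[i] * e0       # wait between later events
    return e0 + (k - 1) * ei


def closed_form(k, j, i, p):
    """Closed form implied by the first-step equations (assumed reading)."""
    q = 1 - p
    return (1 - p ** j) / (q * p ** j) + (k - 1) * (1 - p ** (j - i)) / (q * p ** j)


if __name__ == "__main__":
    half = Fraction(1, 2)
    print(expected_trials(1, 2, 0, half))   # first pair of heads with a fair coin
    print(expected_trials(5, 3, 0, half), expected_trials(5, 3, 2, half))
```

For a fair coin and j = 2 the first call gives the familiar mean of 6 tosses for two heads in a row; comparing i = 0 with i = j - 1 for larger k shows the ratio approaching 1/q.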


BEGINNING OBSERVATIONS AFTER A PROCESS HAS
STARTED


In the previous sections, we assumed that the process
began at n = 1. Now we will consider that the process
has been in operation for a sufficient number of trials so
that even at the first observation, the event might occur.
This implies that a portion of the sequence is stored in a
memory and each observation samples the memory to
see if the event has just occurred. Regardless of when we
begin the observations, they continue, one for each stage
of the process. Consistent with this extension, we define

$\pi(n, k)$ = the conditional probability that in the next
n trials the event occurs k times, given that
the process has been in operation a sufficient
number of times so that it is possible for the
event to occur at the first trial that is
observed.

The value of $\pi(n, k)$ can be obtained by summing the
conditional probabilities of making k events in n trials
from each of the possible states that the process could
be in, multiplied by the probability that the process is in
that state. Thus,

$$\begin{aligned}
\pi(n, k) &= q\,P_0(n, k) + (p - p^2)\,P_1(n, k) + \cdots + (p^{j-1} - p^{j})\,P_{j-1}(n, k) \\
&= q\{P_0(n, k) + p\,P_1(n, k) + \cdots + p^{j-1} P_{j-1}(n, k)\};
\end{aligned} \tag{32}$$

$\pi(n, k)$ can also be written in terms of $P_0$ alone.
It should be apparent that all the "states of beginning"
between $R(n, k)$ and $\pi(n, k)$ can also be developed.


CONCLUSIONS 


Functional equation techniques are easily applied to
the problem of recurrent events, and simple computational
equations are derived for obtaining the probability
density function of the number of recurrences of
the events. The techniques lead directly to the generating
functions of the events, which have many more uses
besides obtaining the rates at which the different processes
develop.

These results are of value to workers in the fields of
automatic control, information theory, and radar. Of
greater significance, however, are the methods employed
to obtain these results. They allow an increased breadth
of approach to all sequential problems.


Direction of Change with Refinement
for Unweighted and Weighted
Information-Entropy Functionals*


The information theory literature1 contains
numerous statements as to the sign
to be prefixed before $\sum_i p_i \log p_i$ and
$\sum_i p_i \log (p_i/q_i)$ in order to get information
or entropy. In this note, we find the direction
of change of generalized functionals
with refinement of partitions of the underlying
space. We suggest that the functionals
be called information or entropy functionals
according as they increase or decrease with
refinement, and that the signs be chosen
accordingly. The unweighted and weighted
functionals behave oppositely in this respect.

Let $\Omega$ be a space and $F$ a countably
additive class of subsets of $\Omega$. Let $\mu_0$ be a
measure, that is, a non-negative countably
additive set function defined on $F$, and
let $K_{\mu_0}(A) = k \log \mu_0(A)$ for each $A \in F$,
where k is a constant scale factor. If $B \in F$,
$0 < \mu_0(B) < \infty$, and $\pi$ is a countable disjoint
partition of B into F-sets, we define
the unweighted or a posteriori4 functional

$$L_{\mu_0}(\pi) = \frac{k}{\mu_0(B)} \sum_{A \in \pi} \mu_0(A) \log \mu_0(A)$$

as the expectation of $K_{\mu_0}$ with respect to $\mu_0$
over the partition $\pi$. Note that $\mu_0(A) < \infty$
for each $A \in \pi$ and that $\mu_0(A) = 0$ contributes
nothing to the sum above.

To define functionals of the weighted or
a priori4 type, we consider a non-negative
finitely superadditive set function $\mu$ in addition
to the measure $\mu_0$. That is, $\mu(A) \ge 0$
for each $A \in F$ and $\mu(A_1 \cup A_2) \ge \mu(A_1) + \mu(A_2)$
for $A_1 \cap A_2 = \emptyset$, $A_1, A_2 \in F$. Then
we form

$$K_{\mu_0, \mu}(A) = k \log \frac{\mu_0(A)}{\mu(A)}$$

and the expectation

$$L_{\mu_0, \mu}(\pi) = \frac{k}{\mu_0(B)} \sum_{A \in \pi} \mu_0(A) \log \frac{\mu_0(A)}{\mu(A)}$$

of $K_{\mu_0, \mu}$ with respect to $\mu_0$, where as before
$\pi$ is a countable disjoint partition of B into
F-sets.

Let $\mu(A) = \prod_{\nu \in M} \nu(A)$ be any non-empty
product of measures $\nu \in M$. If M has
two or more members, then $\mu$ is only superadditive,
for


* Received by the PGIT, December 8, 1948. 

1 C. E. Shannon and W. Weaver, "The Mathematical
Theory of Communication," University of
Illinois Press, Urbana, Ill.; 1949.

2 L. Brillouin, "Science and Information Theory,"
Academic Press, New York, N. Y.; 1956.

3 P. M. Woodward, "Entropy and negentropy,"
IRE Trans. on Information Theory, vol. IT-3,
p. 3; March, 1957.

4 P. M. Woodward, "Probability and Information
Theory, with Applications to Radar," McGraw-Hill
Book Co., Inc., New York, N. Y.; 1954.


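$$\mu(A_1 \cup A_2) = \prod_{\nu \in M} \nu(A_1 \cup A_2) = \prod_{\nu \in M} [\nu(A_1) + \nu(A_2)] \ge \prod_{\nu \in M} \nu(A_1) + \prod_{\nu \in M} \nu(A_2).$$

If $\Omega = \Omega_1 \times \Omega_2$,

$$\mu_1(A_1) = \mu_0(A_1 \times \Omega_2), \qquad \mu_2(A_2) = \mu_0(\Omega_1 \times A_2),$$

and $\mu$ is the product measure $\mu_1 \times \mu_2$, then
$L_{\mu_0, \mu}(\pi)$ tends to the Kolmogorov information5
as $\pi$ is increasingly refined.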

There are two essentially different situations
possible for a measure $\mu_0$ with countable
disjoint partitions. In the discrete
(atomic) case, there is a finest partition
$\pi_{\max}$ such that $0 < \mu_0(B) < \infty$ for each
$B \in \pi_{\max}$. In the continuous (atomless) case,
there is no such partition. (A partition $\pi'$
is said to be a refinement of $\pi$, in case each
$B \in \pi$ is a countable union of sets $B' \in \pi'$.)

We now derive some elementary inequalities.
Let

$$\Phi(a_1, \ldots, a_n) = \Big(\sum_{i=1}^{n} a_i\Big) \log \Big(\sum_{i=1}^{n} a_i\Big) - \sum_{i=1}^{n} (a_i \log a_i).$$

Then

$$\Phi(a_1, \ldots, a_n, a_{n+1}) \ge \Phi(a_1, \ldots, a_n) \ge 0 \tag{1}$$

for $a_1, \ldots, a_{n+1} \ge 0$.


To prove this, note6 that


at ae i 
Pd, a) = 5 i} log (1 te ) 


“log (1 Il ae 8) de 2 0. 


By induction, it is easy to show that

$$\Phi(a_1, \ldots, a_{n+1}) = \Phi\Big(\sum_{i=1}^{n} a_i,\; a_{n+1}\Big) + \Phi(a_1, \ldots, a_n) \ge \Phi(a_1, \ldots, a_n).$$

5 A. N. Kolmogorov, "On the Shannon theory of
information transmission in the case of continuous
signals," IRE Trans. on Information Theory,
vol. IT-2, p. 103; December, 1956.

6 D. Bierens de Haan, "Nouvelle Table des Intégrales
Définies," Stechert, New York, N. Y., table
33, no. 5, p. 59; 1939.

If $\{a_k\}$ is a non-negative sequence, we
define $\Phi(a_1, a_2, \ldots)$ as

$$\lim_{n \to \infty} \Phi(a_1, a_2, \ldots, a_n),$$

which exists and is non-negative by (1).
This suffices for the unweighted case.

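The more complicated weighted case is
attacked in two stages. First, let

$$\Psi(a_1, \ldots, a_n;\; b_1, \ldots, b_n) = \Big(\sum_{i=1}^{n} a_i\Big) \log \frac{\sum_{i=1}^{n} a_i}{\sum_{i=1}^{n} b_i} - \sum_{i=1}^{n} \Big(a_i \log \frac{a_i}{b_i}\Big),$$

for $a_i \ge 0$, $b_i > 0$.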

By induction, we can show that

$$\Psi(a_1, \ldots, a_{n+1};\; b_1, \ldots, b_{n+1}) = \Psi(a_1, \ldots, a_n;\; b_1, \ldots, b_n) + \Psi\Big(\sum_{i=1}^{n} a_i,\; a_{n+1};\; \sum_{i=1}^{n} b_i,\; b_{n+1}\Big)$$

for $n \ge 2$. By a semi-geometric argument,
we can show that $\Psi(a_1, a_2;\, b_1, b_2) \le 0$.
Clearly the surface $z = \Phi(x, y)$ is a convex
cone with vertex at the origin. Any tangent
plane touches the surface in a ray whose xy
trace is $b_2 x - b_1 y = 0$ for some $b_1, b_2 > 0$,
and conversely. The corresponding tangent
plane is

$$z = x \log \frac{b_1 + b_2}{b_1} + y \log \frac{b_1 + b_2}{b_2}.$$

Since the surface is convex, every tangent
plane is an upper bounding plane, so that

$$\Phi(a_1, a_2) - a_1 \log \frac{b_1 + b_2}{b_1} - a_2 \log \frac{b_1 + b_2}{b_2} \le 0.$$

Clearly then

$$\Psi(a_1, a_2;\; b_1, b_2) \le 0$$

and

$$\Psi(a_1, \ldots, a_{n+1};\; b_1, \ldots, b_{n+1}) \le \Psi(a_1, \ldots, a_n;\; b_1, \ldots, b_n) \le 0. \tag{2}$$

Hence $\Psi(a_1, a_2, \ldots;\, b_1, b_2, \ldots) = \lim_{n \to \infty} \Psi(a_1, \ldots, a_n;\, b_1, \ldots, b_n)$
exists and is $\le 0$
for sequences $\{a_k\}$, $\{b_k\}$, $a_k \ge 0$, $b_k > 0$.
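Second, let

$$\Psi(a_1, \ldots, a_n;\; b_1, \ldots, b_n;\; c_n) = \Big(\sum_{i=1}^{n} a_i\Big) \log \frac{\sum_{i=1}^{n} a_i}{c_n} - \sum_{i=1}^{n} \Big(a_i \log \frac{a_i}{b_i}\Big),$$

and note that

$$\begin{aligned}
\Psi(a_1, \ldots, a_{n+1};\; b_1, \ldots, b_{n+1};\; c_{n+1}) ={}& \Psi(a_1, \ldots, a_n;\; b_1, \ldots, b_n;\; c_n) \\
&+ \Psi\Big(\sum_{i=1}^{n} a_i,\; a_{n+1};\; c_n,\; b_{n+1}\Big) \\
&+ \Big(\sum_{i=1}^{n+1} a_i\Big) \log \frac{c_n + b_{n+1}}{c_{n+1}}.
\end{aligned}$$

Thus $c_{n+1} \ge c_n + b_{n+1}$ implies

$$\Psi(a_1, \ldots, a_{n+1};\; b_1, \ldots, b_{n+1};\; c_{n+1}) \le \Psi(a_1, \ldots, a_n;\; b_1, \ldots, b_n;\; c_n). \tag{3}$$

Since

$$\Psi(a_1, a_2;\; b_1, b_2;\; c) = \Psi(a_1, a_2;\; b_1, b_2) + (a_1 + a_2) \log \frac{b_1 + b_2}{c},$$

we have $\Psi(a_1, a_2;\, b_1, b_2;\, c) \le 0$ whenever
$c \ge b_1 + b_2$. Hence if $c_2 \ge b_1 + b_2$, $c_{n+1} \ge c_n + b_{n+1}$
for each $n \ge 2$, then

$$\Psi(a_1, a_2, \ldots;\; b_1, b_2, \ldots;\; c_2, c_3, \ldots) = \lim_{n \to \infty} \Psi(a_1, \ldots, a_n;\; b_1, \ldots, b_n;\; c_n)$$

exists and is $\le 0$.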
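The elementary inequalities are easy to probe numerically. The sketch below is only a spot check under assumptions: it takes $\Phi(a) = (\sum a_i)\log(\sum a_i) - \sum a_i \log a_i$ and $\Psi(a; b; c) = (\sum a_i)\log(\sum a_i / c) - \sum a_i \log(a_i/b_i)$, with c defaulting to $\sum b_i$, and tests the signs and the monotone behavior on random positive inputs. The helper names and the tolerance are choices of this sketch.

```python
import math
import random


def xlogx(x):
    """x * log(x), with the usual convention 0 * log(0) = 0."""
    return x * math.log(x) if x > 0 else 0.0


def phi(a):
    """Phi(a_1..a_n) = (sum a) log(sum a) - sum(a_i log a_i)."""
    return xlogx(sum(a)) - sum(xlogx(x) for x in a)


def psi(a, b, c=None):
    """Psi(a; b; c) = (sum a) log(sum a / c) - sum(a_i log(a_i / b_i))."""
    c = sum(b) if c is None else c
    total = sum(a)
    head = total * math.log(total / c) if total > 0 else 0.0
    return head - sum(x * math.log(x / y) for x, y in zip(a, b) if x > 0)


def check(trials=200, seed=0, eps=1e-9):
    """Spot-check the sign and monotonicity claims on random inputs."""
    rng = random.Random(seed)
    for _ in range(trials):
        n = rng.randint(2, 6)
        a = [rng.uniform(0.01, 5.0) for _ in range(n)]
        b = [rng.uniform(0.01, 5.0) for _ in range(n)]
        assert phi(a) >= -eps                       # Phi is non-negative
        assert phi(a) >= phi(a[:-1]) - eps          # and grows with n
        assert psi(a, b) <= eps                     # Psi is non-positive
        assert psi(a, b) <= psi(a[:-1], b[:-1]) + eps
        # enlarging c beyond sum(b) can only lower Psi
        assert psi(a, b, sum(b) + rng.uniform(0.0, 3.0)) <= psi(a, b) + eps
    return True


if __name__ == "__main__":
    print(check())
```

Equality in the weighted inequality occurs when the a's are proportional to the b's, which is a convenient sanity case.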
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The elementary inequalities yield in- 
equalities for the unweighted and weighted 
functionals as partitions are refined. 

Theorem 1: Let $\pi'$ be a refinement of $\pi$.
If $k \le 0$, then

$$L_{\mu_0}(\pi') \ge L_{\mu_0}(\pi),$$

and if $k > 0$,

$$L_{\mu_0}(\pi') \le L_{\mu_0}(\pi).$$

Proof: If $\pi = \{B_i\}$, $\pi' = \{C_j'\}$ are partitions
of $A \in F$, then for each i there is a
set $N_i$ of indexes such that $B_i = \bigcup_{j \in N_i} C_j'$,
by definition of refinement. From (1),

$$\begin{aligned}
0 &\le \Big(\sum_{j \in N_i} \mu_0(C_j')\Big) \log \Big(\sum_{j \in N_i} \mu_0(C_j')\Big) - \sum_{j \in N_i} \mu_0(C_j') \log \mu_0(C_j') \\
&= \mu_0(B_i) \log \mu_0(B_i) - \sum_{j \in N_i} \mu_0(C_j') \log \mu_0(C_j').
\end{aligned}$$

Summing over i yields

$$0 \le \sum_{i} \mu_0(B_i) \log \mu_0(B_i) - \sum_{j} \mu_0(C_j') \log \mu_0(C_j'),$$

so that for $k > 0$,

$$L_{\mu_0}(\pi') = \frac{k}{\mu_0(A)} \sum_{j} \mu_0(C_j') \log \mu_0(C_j') \le \frac{k}{\mu_0(A)} \sum_{i} \mu_0(B_i) \log \mu_0(B_i) = L_{\mu_0}(\pi).$$

This is to be expected. However, intuition
about the weighted functionals
$L_{\mu_0, \mu}$ is weaker. The $\mu_0$ dependence of
$L_{\mu_0, \mu}$ would suggest the same behavior as
that of $L_{\mu_0}$, and the $\mu$ dependence would
suggest the opposite behavior.

We now prove that if the weighting set
function $\mu$ is super-additive, then the
weighted functional $L_{\mu_0, \mu}$ behaves oppositely
to $L_{\mu_0}$ for a given choice of k.

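Theorem 2: Let $\mu$ be a non-negative
super-additive set function. If $k < 0$, then

$$L_{\mu_0, \mu}(\pi') \le L_{\mu_0, \mu}(\pi),$$

and if $k \ge 0$,

$$L_{\mu_0, \mu}(\pi') \ge L_{\mu_0, \mu}(\pi).$$

Proof: Write $\{C_j' : j \in N_i\}$ as $\{D_k^{(i)}\}$. Let

$$a_k = \mu_0(D_k^{(i)}), \qquad b_k = \mu(D_k^{(i)}), \qquad c_n = \mu\Big(\bigcup_{k=1}^{n} D_k^{(i)}\Big).$$

Note that

$$c_2 = \mu(D_1^{(i)} \cup D_2^{(i)}) \ge \mu(D_1^{(i)}) + \mu(D_2^{(i)}) = b_1 + b_2,$$

$$c_{n+1} = \mu\Big(\bigcup_{k=1}^{n} D_k^{(i)} \cup D_{n+1}^{(i)}\Big) \ge \mu\Big(\bigcup_{k=1}^{n} D_k^{(i)}\Big) + \mu(D_{n+1}^{(i)}) = c_n + b_{n+1},$$

since the partition $\{D_k^{(i)}\}$ of $B_i$ is disjoint.
Moreover

$$\mu(B_i) = \mu\Big(\bigcup_{k=1}^{\infty} D_k^{(i)}\Big) \ge \mu\Big(\bigcup_{k=1}^{n} D_k^{(i)}\Big) + \mu\Big(\bigcup_{k=n+1}^{\infty} D_k^{(i)}\Big) \ge \mu\Big(\bigcup_{k=1}^{n} D_k^{(i)}\Big),$$

so

$$\mu(B_i) \ge \lim_{n \to \infty} \mu\Big(\bigcup_{k=1}^{n} D_k^{(i)}\Big).$$

Hence by (3) and
the continuity of $x \log x/y$ for $x, y > 0$,
we have

$$\begin{aligned}
0 &\ge \Psi(a_1, a_2, \ldots;\; b_1, b_2, \ldots;\; c_2, c_3, \ldots) \\
&= \Big(\sum_{k} \mu_0(D_k^{(i)})\Big) \log \frac{\sum_{k} \mu_0(D_k^{(i)})}{\lim_{n \to \infty} \mu\big(\bigcup_{k=1}^{n} D_k^{(i)}\big)} - \sum_{k} \mu_0(D_k^{(i)}) \log \frac{\mu_0(D_k^{(i)})}{\mu(D_k^{(i)})} \\
&\ge \mu_0(B_i) \log \frac{\mu_0(B_i)}{\mu(B_i)} - \sum_{k} \mu_0(D_k^{(i)}) \log \frac{\mu_0(D_k^{(i)})}{\mu(D_k^{(i)})}.
\end{aligned}$$

Summation over i and multiplication by
$k/\mu_0(A)$ gives the stated result.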

The above results can be generalized to
conditional functionals. If, as we suggest,
information is to be a nondecreasing functional
of partition refinement, then any
non-negative linear combination of $L_{\mu_0}$,
$L_{\mu_0'}, \cdots$ (with negative constants) and
$L_{\mu_0, \mu}$, $L_{\mu_0', \mu'}, \cdots$ (with positive constants)
for measures $\mu_0, \mu_0', \cdots$ and superadditive
set functions $\mu, \mu', \cdots$ would supply a
suitable information functional. On the
other hand, any non-negative linear combination
of $L_{\mu_0}$, $L_{\mu_0'}, \cdots$ (with positive constants)
and $L_{\mu_0, \mu}$, $L_{\mu_0', \mu'}, \cdots$ (with negative
constants) would supply a suitable entropy
functional.

We now make an elementary application
of these remarks. Let $\xi$, $\eta$ be discrete random
variables, and let

$$P_{ij} = P(\xi = \xi_i,\; \eta = \eta_j), \qquad H(\xi, \eta) = -\sum_{i,j} P_{ij} \log P_{ij}.$$

The functionals $H(\xi)$, $H(\eta)$, $H(\xi|\eta)$, $H(\eta|\xi)$,
and $R(\xi, \eta) = H(\xi, \eta) - H(\xi) - H(\eta)$
are defined as usual. From the point of
view expressed above, $H(\xi, \eta)$, $-H(\xi|\eta)$,
$-H(\eta|\xi)$, $H(\xi)$, $H(\eta)$, and $-R(\xi, \eta)$ have the
nature of information. Kolmogorov5 calls
$I(\xi, \eta) = -R(\xi, \eta)$ the quantity of information
in $\xi$ relative to $\eta$.

From the monotone character of $L_{\mu_0}$
and $L_{\mu_0, \mu}$, and the fact that disjoint countable
(or disjoint finite) partitions of a set
form a directed system7 (net) with respect
to ordering by refinement, it follows that
limits with respect to partition refinement
always exist and define set functions.
In the discrete (atomic) case, the maximal
partition furnishes the limiting value.
In the continuous (atomless) case, the
limits are nontrivial. Under suitable restrictions
on $\mu_0$, $\mu$ the author has shown
that the limiting set functions $L_{\mu_0, \mu}$ possess
the properties usually ascribed to information
and entropy, and are finite. This
generalizes some results of Kolmogorov5
and his associates.

R. LEIPNIK
Michelson Lab.
U. S. Naval Ordnance Test Station
China Lake, Calif.


7 J. L. Kelley, "General Topology," D. Van
Nostrand Co., Inc., New York, N. Y.; 1956.


Correction to ‘‘A Note on Angle 
Modulation by a Mixture of a Peri- 
odic Function and Noise”’ 


P. R. Karr, author of this Correspond- 
ence item which appeared in the Septem- 
ber, 1959, issue of these TRANSACTIONS, 
has called the following to the attention of 
the Editor. 

The sentence immediately after (13) 
should read: “The latter approximation is 
not inconsistent with the former.” 

In (3), wo should be w. 

Above (6), the symbol X — © should 
be deleted. 


Further Comments on ‘‘A Markoff 
Envelope Process’’* 


In an article by Pierce1 it was shown that
when a singly tuned RLC filter is excited
with white Gaussian noise, the envelope of
the resulting narrow-band noise constitutes
a first-order Markoff process. A concise
alternative proof of this same fact, based
on use of the Fokker-Planck equation, was
given by Helstrom in a recent article, and
a third proof, essentially intuitive in character,
was offered by Isley.2 This last proof
contains a number of apparent errors which
are summarized in what follows. We will
take "Markoff" to mean first-order Markoff
in the remaining material.


* Received by the PGIT, August 9, 1929. 

1 J. N. Pierce, "A Markoff envelope process,"
IRE Trans. on Information Theory, vol. IT-4,
pp. 163-166; December, 1958.

In the third paragraph of Isley's letter,2
it is stated that "Since the envelope detector
is a zero memory device, the dimensionality
of the process at its output
cannot, by dint of definition, exceed the
dimensionality of the process at its input."
This statement is in error for two reasons:
First, the implied definition is incorrect
since it is necessary that the nonlinear device
operating on the input be not only
zero-memory but also one-to-one. Second,
the envelope detector, considered as a
physical device, is neither zero-memory
nor one-to-one. An example is given in
Appendix I showing that the square of a
discrete Markoff process need not be
Markoff. A similar proof, due to Price3
and not reproduced here, shows that the
square of the nonzero mean Markoff
Gaussian process is not Markoff. These
two examples are sufficient to show the
necessity for a monotonic transformation
of the input to preserve the Markoff
character. That the actual envelope detector
is not memoryless or monotonic is
obvious from the fact that it must have a
memory extending over at least a few
cycles of the center frequency, and from
the fact that it is insensitive to the actual
phase of the input signal.

As was explicitly stated by Pierce,4 the
in-phase and quadrature components of
the given narrow-band signal are themselves
Markoff, and statistically identical.
In the latter part of the third paragraph
of Isley's letter it is suggested that it then
follows trivially that the envelope is also
Markoff. This is not true since examples
can be constructed showing that the length
of a two-dimensional Markoff vector
(having statistically identical, independent
orthogonal components) need not be
Markoff. Such an example is given in
Appendix II. Alternatively, one might
base the demonstration on the fact that
the square of the zero mean, Markoff
Gaussian process is also Markoff;3 however,
it would then be necessary to show that
the sum of two such squared components
was also Markoff. This is not trivial either,
since it is not true in general that sum
processes are of the same order as the
components. An illustration of this is
again to be found in the example of Appendix II.

The interesting conjecture about higher
order Markoff envelope processes made in
the last paragraph of Isley's letter may
quite possibly be true; it would certainly
require a more rigorous proof than was
presented in the letter.

It should be pointed out that there is a
typographical omission in the original
article.5 Eq. (19) should have in the denominator
of the right-hand side, the
multiple product appearing on the right-hand
side of (18).


2 C. W. Helstrom and C. T. Isley, "Two notes on
a Markoff envelope process," IRE Trans. on Information
Theory, vol. IT-5, pp. 139-140; September, 1959.

3 R. Price, M.I.T. Lincoln Lab., Lexington, Mass.,
unpublished material.

4J. N. Pierce, op. cit., p. 16 

5 Pierce, op. cit., p. 165, dss, 


APPENDIX I 


NON-MARKOFF SQUARE OF MARKOFF
PROCESS


Let $x_1, x_2, \cdots, x_m, \cdots$ be a stationary discrete
Markoff process, where x can assume
the values -1, 1, and 3. Let the matrix
of transition probabilities, $p(x_{m+1} \mid x_m)$, be
given by

$$\begin{array}{c|ccc}
x_m \backslash x_{m+1} & -1 & 1 & 3 \\ \hline
-1 & 1 - a - a^2 & a & a^2 \\
1 & a & 1 - 2a & a \\
3 & a^2 & a & 1 - a - a^2
\end{array}$$

We find immediately that $p(-1) = p(1) = p(3) = \tfrac{1}{3}$.
Let $y_m = x_m^2$, and let $p_x(x_1, x_2, \cdots, x_m)$
denote the probability of a particular
sequence of x values, and let $p_x(x_{m+1} \mid x_1,
x_2, \cdots, x_m)$ denote the conditional probability
of a particular value of $x_{m+1}$.

Then, for example, we have

$$\begin{aligned}
p_y(1, 1, 9) &= p_x(1, 1, 3) + p_x(1, -1, 3) + p_x(-1, 1, 3) + p_x(-1, -1, 3) \\
&= \tfrac{1}{3}\, a(1 - a^3)
\end{aligned}$$

$$\begin{aligned}
p_y(1, 1) &= p_x(1, 1) + p_x(1, -1) + p_x(-1, 1) + p_x(-1, -1) \\
&= \tfrac{1}{3}(2 + a)(1 - a)
\end{aligned}$$

$$p_y(1, 9) = p_x(1, 3) + p_x(-1, 3) = \tfrac{1}{3}\, a(1 + a)$$

$$p_y(1) = p_x(1) + p_x(-1) = \tfrac{2}{3}.$$

It follows, then, that

$$p_y(9 \mid 1, 1) = \frac{p_y(1, 1, 9)}{p_y(1, 1)} = \frac{a + a^2 + a^3}{2 + a}$$

$$p_y(9 \mid 1) = \frac{p_y(1, 9)}{p_y(1)} = \frac{a + a^2}{2}.$$

Since $p_y(9 \mid 1, 1) \ne p_y(9 \mid 1)$, $\{y\}$ is not
Markoff.
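The arithmetic of this appendix can be confirmed by enumeration. The sketch below is an illustrative check, not part of the original correspondence: it assumes the symmetric three-state transition matrix on the values -1, 1, 3 with entries $1 - a - a^2$, $a$, $a^2$ / $a$, $1 - 2a$, $a$ / $a^2$, $a$, $1 - a - a^2$ and the uniform stationary distribution, and sums the probabilities of all x sequences whose squares match a given y sequence. The function names are hypothetical.

```python
from fractions import Fraction
from itertools import product


def transition_matrix(a):
    """Transition probabilities p(x_{m+1} | x_m) on the values -1, 1, 3."""
    return {
        -1: {-1: 1 - a - a * a, 1: a, 3: a * a},
        1: {-1: a, 1: 1 - 2 * a, 3: a},
        3: {-1: a * a, 1: a, 3: 1 - a - a * a},
    }


def prob_y(y_seq, a):
    """Probability of a sequence of squared values y_m = x_m ** 2."""
    T = transition_matrix(a)
    total = Fraction(0)
    for xs in product((-1, 1, 3), repeat=len(y_seq)):
        if any(x * x != y for x, y in zip(xs, y_seq)):
            continue
        pr = Fraction(1, 3)  # the stationary distribution is uniform
        for prev, cur in zip(xs, xs[1:]):
            pr *= T[prev][cur]
        total += pr
    return total


if __name__ == "__main__":
    a = Fraction(1, 4)
    two_step = prob_y((1, 1, 9), a) / prob_y((1, 1), a)
    one_step = prob_y((1, 9), a) / prob_y((1,), a)
    print(two_step, one_step)
```

Because the two conditional probabilities differ, knowing one extra past value of y changes the prediction, which is exactly the failure of the Markoff property.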


APPENDIX IT 


NON-MARKOFF SUM OF MARKOFF
PROCESSES AND NON-MARKOFF LENGTH
OF MARKOFF VECTOR


Let $x_1, x_2, \cdots, x_m, \cdots$, and $y_1, y_2, \cdots,
y_m, \cdots$, be two statistically independent,
statistically identical stationary Markoff
sequences, where x, y can assume the values
0, 1, and 2. Let the transition probability
matrix be

$$\begin{array}{c|ccc}
 & 0 & 1 & 2 \\ \hline
0 & 1 - 2a & a & a \\
1 & a & 1 - 2a & a \\
2 & a & a & 1 - 2a
\end{array}$$

The x (or y) values are then equiprobable.
Let $z_1, z_2, \cdots, z_m, \cdots$ be the sum
sequence with $z_m = x_m + y_m$. Using the
same notation as in Appendix I we find

$$\begin{aligned}
p_z(0, 2, 0) &= p_x(0, 2, 0)\,p_y(0, 0, 0) + p_x(0, 1, 0)\,p_y(0, 1, 0) + p_x(0, 0, 0)\,p_y(0, 2, 0) \\
&= \frac{2a^2 - 8a^3 + 9a^4}{9}
\end{aligned}$$

$$\begin{aligned}
p_z(0, 2) = p_z(2, 0) &= p_x(0, 0)\,p_y(0, 2) + p_x(0, 1)\,p_y(0, 1) + p_x(0, 2)\,p_y(0, 0) \\
&= \frac{2a - 3a^2}{9}
\end{aligned}$$

$$p_z(2) = \tfrac{1}{3}.$$

We then have

$$p_z(0 \mid 0, 2) = \frac{p_z(0, 2, 0)}{p_z(0, 2)} = \frac{2a^2 - 8a^3 + 9a^4}{2a - 3a^2}$$

$$p_z(0 \mid 2) = \frac{p_z(2, 0)}{p_z(2)} = \frac{2a - 3a^2}{3}.$$

Since $p_z(0 \mid 0, 2) \ne p_z(0 \mid 2)$, the sum process
is not Markoff.

Now let $u = \sqrt{x}$, $v = \sqrt{y}$, $w = \sqrt{z}$.
If we identify u and v with the orthogonal
components of a two-dimensional vector,
then w is the length of the vector. It then
follows that the length need not be Markoff
even though the vector is.
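The same kind of enumeration confirms the sum example. This is a hedged sketch rather than the authors' computation: it assumes each component chain stays put with probability $1 - 2a$ and moves to either other state with probability $a$, starts from the uniform stationary distribution, and sums over all ways of splitting each z value into x + y. The function names are illustrative.

```python
from fractions import Fraction
from itertools import product


def prob_x(seq, a):
    """Probability of a state sequence of one component chain (states 0, 1, 2)."""
    pr = Fraction(1, 3)  # equiprobable stationary distribution
    for prev, cur in zip(seq, seq[1:]):
        pr *= (1 - 2 * a) if prev == cur else a
    return pr


def prob_z(z_seq, a):
    """Probability of a sequence of sums z_m = x_m + y_m of two independent chains."""
    total = Fraction(0)
    for xs in product(range(3), repeat=len(z_seq)):
        ys = tuple(z - x for z, x in zip(z_seq, xs))
        if all(0 <= y <= 2 for y in ys):
            total += prob_x(xs, a) * prob_x(ys, a)
    return total


if __name__ == "__main__":
    a = Fraction(1, 5)
    given_two_past = prob_z((0, 2, 0), a) / prob_z((0, 2), a)
    given_one_past = prob_z((2, 0), a) / prob_z((2,), a)
    print(given_two_past, given_one_past)
```

Since $w = \sqrt{z}$ is a one-to-one function of z on the values 0 to 4, the same numbers show that the vector length is not Markoff either.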
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Reply by Mr. Isley* 


The term "envelope detector" in the
third paragraph of Isley6 was intended to
apply particularly to the equivalent detection
process which yields the envelope
function assumed by Pierce.1

It is recognized that, in terms of actual 
physical conditions, envelope detectors or 
perhaps more properly, envelope detection 
processes introduce at a minimum an in- 
crease of one dimension in the process 
even though the nonlinear element proper 
may have zero memory. It may be re- 
marked that the analysis advanced by 
Pierce is accordingly only an approxima- 
tion, since he has neglected the memory 
introduced by the post detection filter 
associated with the nonlinear element. It 
might be intuitively argued that this ap- 
proximation is valid if it is postulated that 
a suitable postdetection filter is available: 
a filter having negligible effect on the so- 
called envelope components while provid- 
ing an arbitrarily high suppression of all 
other components out of the nonlinear 
device. It may also be noted that the as- 
sumption regarding orthogonality of the 
“envelope” components also introduces an
approximation if a rigorous mathematical 
interpretation of the assumed physical pro- 
cess is required. 

Practical physical conditions notwith- 
standing, the process postulated by Pierce 
may be readily established as a mathematical
model. In such a model, the “envelope
detector” might be a three-port
device having two inputs and one output. 
The orthogonal components of the input 
process are respectively applied to the 
inputs. These component processes, as 
previously noted, are independent, sta- 
tistically identical, and two-dimensional 


* Received by the PGIT, September 28, 1959.
6 Helstrom and Isley, op. cit. (see the third para- 
graph in the second note.) 
Gaussian. The detection process is as
follows: each input component is squared; 
the resulting squares are linearly summed; 
the square root of the resulting sum is 
effected. For such a model, the detection 
process obviously introduces no memory. 
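
The three-port model just described can be written down directly. This is only an illustrative sketch; the function name `envelope_detector` and the use of sample lists are assumptions made here, not part of the original reply.

```python
import math

def envelope_detector(x_samples, y_samples):
    """Memoryless three-port model: square each input, sum, take the square root.

    Each output sample depends only on the two input samples taken at the
    same instant, so the detector itself introduces no memory.
    """
    return [math.sqrt(x * x + y * y) for x, y in zip(x_samples, y_samples)]

print(envelope_detector([3.0, 0.0], [4.0, 1.0]))
```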

The fact that samples of the envelope 
function derived from the components of 
the input process constitute a first-order 
Markoff process is readily demonstrated 
as follows. 

It may be noted that the components 
of the input process X(t) and Y(t), re- 
spectively, may be expressed as: 


$$
X(t) = \int_0^t e^{-a(t - \tau)}\, U(\tau)\, d\tau
$$

$$
Y(t) = \int_0^t e^{-a(t - \tau)}\, V(\tau)\, d\tau
$$


where U(t) and V(t) are independent, 
statistically identical, first-order Gaussian 
processes. 

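It follows that, for $t_N > t_{N-1}$,

$$
X(t_N) = p(t_N, t_{N-1}) + b_{N,N-1}\, X(t_{N-1})
$$

$$
Y(t_N) = q(t_N, t_{N-1}) + b_{N,N-1}\, Y(t_{N-1})
$$

where:

$$
p(t_N, t_{N-1}) = \int_{t_{N-1}}^{t_N} e^{-a(t_N - \tau)}\, U(\tau)\, d\tau
$$

$$
q(t_N, t_{N-1}) = \int_{t_{N-1}}^{t_N} e^{-a(t_N - \tau)}\, V(\tau)\, d\tau
$$

$$
b_{N,N-1} = e^{-a(t_N - t_{N-1})}
$$

$$
X(t_{N-1}) = \int_0^{t_{N-1}} e^{-a(t_{N-1} - \tau)}\, U(\tau)\, d\tau
$$

$$
Y(t_{N-1}) = \int_0^{t_{N-1}} e^{-a(t_{N-1} - \tau)}\, V(\tau)\, d\tau
$$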


As a fact of fundamental significance in
the conclusion to follow, it should be noted
that the memory in both of the Gaussian
processes defined respectively by the
functions $p(t_N, t_{N-1})$ and $q(t_N, t_{N-1})$ is confined
solely to the epoch $t_N > t > t_{N-1}$.
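
The step from the integral form to the recursion can be made explicit. The following is a sketch of the splitting argument, assuming the exponential kernel $e^{-a(t-\tau)}$ that the definition of $b_{N,N-1}$ suggests; the same steps apply to $Y$ with $V$ and $q$ in place of $U$ and $p$.

```latex
\begin{aligned}
X(t_N) &= \int_0^{t_N} e^{-a(t_N-\tau)}\,U(\tau)\,d\tau \\
       &= e^{-a(t_N-t_{N-1})}\int_0^{t_{N-1}} e^{-a(t_{N-1}-\tau)}\,U(\tau)\,d\tau
          + \int_{t_{N-1}}^{t_N} e^{-a(t_N-\tau)}\,U(\tau)\,d\tau \\
       &= b_{N,N-1}\,X(t_{N-1}) + p(t_N, t_{N-1}).
\end{aligned}
```

The first term depends on the driving process only up to $t_{N-1}$ and the second only on the interval from $t_{N-1}$ to $t_N$, which is the separation the argument relies on.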

Noting that:

$X(t_N)$, $Y(t_N)$ are the rectilinear components
of the two-dimensional vector $\vec{R}_N$,

$X(t_{N-1})$, $Y(t_{N-1})$ are the rectilinear
components of the two-dimensional
vector $\vec{R}_{N-1}$,

and defining

$p(t_N, t_{N-1})$, $q(t_N, t_{N-1})$ to be the rectilinear
components of the two-dimensional
vector $\vec{W}_{N,N-1}$,

the following vector relationship may be
obviously established:

$$
\vec{R}_N = b_{N,N-1}\, \vec{R}_{N-1} + \vec{W}_{N,N-1}.
$$


It then obviously follows that the joint
probability distribution

$$
p(R_N, R_{N-1}, \theta_{N,N-1};\ \tau_{N,N-1})
$$

where

$$
\tau_{N,N-1} = t_N - t_{N-1}
$$

$$
R_N = |\vec{R}_N| / \sigma_N
$$

$$
R_{N-1} = |\vec{R}_{N-1}| / \sigma_{N-1}
$$

constitutes a complete set of statistics for
the process $\{\vec{R}_N, \vec{R}_{N-1}\}$. The fact that the
marginal distribution $p(R_N, R_{N-1};\ \tau_{N,N-1})$
establishes the process $\{R_N, R_{N-1}\}$ to be
Markoff follows trivially.

The examples cited in Pierce’s comments 
are interesting, but do not appear relevant 
to the processes considered in Pierce’s 
paper and Isley’s note. 

C. T. Isley
Hughes Aircraft Co. 
Los Angeles, Calif. 


Letter from Mr. Pierce* 


Mr. Isley is to be complimented for 
suggesting a more direct form. I would 
like to offer an even shorter proof which 
I believe contains the substance of his. 

Let the subscripts 1, 2, ..., N correspond
to time instants $t_1 < t_2 < \dots < t_N$.
Let R and θ be the amplitude and phase,
and let x and y be the inphase and quadra- 
ture components of the previously described 
narrowband process. Then 


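$$
\begin{aligned}
p(R_N \mid R_1, R_2, \dots, R_{N-1})
&= p(R_N \mid R_1, R_2, \dots, R_{N-1}, \theta_{N-1}) && (1) \\
&= p(R_N \mid R_1, R_2, \dots, R_{N-2}, x_{N-1}, y_{N-1}) && (2) \\
&= p(R_N \mid x_{N-1}, y_{N-1}) && (3) \\
&= p(R_N \mid R_{N-1}). && (4)
\end{aligned}
$$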


(1) follows from the fact that the envelope 
distributions are independent of arbitrary 
fixed phase shifts of the process. Selecting 
any particular value for the phase angle 
at the immediately previous sample point 
fixes the value of the inphase and quadra- 
ture components at that sample point as 
indicated by (2). But since x and y are 
known to be statistically identical zero 
mean Markoff processes, all available
information about $x_N$ and $y_N$ (and hence $R_N$)
is given by knowledge of $x_{N-1}$ and $y_{N-1}$
as implied by (3). Finally, since the choice
of $\theta_{N-1}$ was completely arbitrary, $x_{N-1}$ and
$y_{N-1}$ are determined solely by $R_{N-1}$ which
gives the desired result (4), and completes 
the proof. 


* Received by the PGIT, October 10, 1959. 
This proof can not be extended to higher
order Markoff processes. To do this would 
require an assumption of arbitrary phase 
angles at two or more previous sample 
points; this assumption is not permissible. 

The same type of proof can be used to 
show in a simple fashion that the square 
of the zero mean Gaussian Markoff process 
is also Markoff: 

Let x be the zero mean Gaussian Markoff
process, let $y = x^2$, and let $s = |x|/x$.
(s is plus or minus one corresponding to
positive or negative x.) Then

$$
\begin{aligned}
p(y_N \mid y_1, y_2, \dots, y_{N-1})
&= p(y_N \mid y_1, y_2, \dots, y_{N-1}, s_{N-1}) && (5) \\
&= p(y_N \mid y_1, y_2, \dots, y_{N-2}, x_{N-1}) && (6) \\
&= p(y_N \mid x_{N-1}) && (7) \\
&= p(y_N \mid y_{N-1}). && (8)
\end{aligned}
$$


The supporting reasoning takes an identi- 
cal form to that following (1)-(4). 

It is also obvious that whether or not 
the phase angle is Markoff (for the narrow 
band process) this type of proof can not 
be used to demonstrate the fact. The as- 
sumption of an arbitrary envelope sample 
value is equivalent to assuming an arbi- 
trary gain factor and consequently assum- 
ing a new, and unknown, mean power for 
the narrowband process.

J. N. Pierce


Error-Correcting Codes for an Asym- 
metric Nonbinary Channel* 


In transmission of messages with pulse 
amplitude modulation through a noisy 
channel with no memory, detection and 
correction schemes of small errors
[(±1)-errors] for a symmetric nonbinary channel
have been extensively studied in the
literature.¹⁻⁴ However, it is observed that the
intrinsic transmission characteristics in 
some physical devices as well as in some 
communication systems are of an asym- 
metric nature. For instance, one example 
is a magnetic tape storage unit in a large-scale
computer;⁵ another is the dissimilar
fading on mark and space channels which 


* Received by the PGIT, October 19, 1959. This
work was supported by Natl. Science Foundation
Grant G-3676.

1 M. J. E. Golay, “Notes on digital coding,”
Proc. IRE, vol. 37, p. 657; June, 1949.

2 W. Ulrich, “Non-binary error correcting codes,”
Bell Sys. Tech. J., vol. 36, pp. 1341-1388; November,
1957.

3 C. Y. Lee, “Some properties of nonbinary
error-correcting codes,” IRE Trans. on Information
Theory, vol. IT-4, pp. 77-82; June, 1958.

4 H. S. Shapiro and D. L. Slotnick, “On the
mathematical theory of error-correcting codes,”
IBM J. Res. and Dev., vol. 3, pp. 25-34; January,
1959.

5 K. G. Newman and L. O. Nippe, “Simulation
of an Information Channel on the IBM 704 Computer,”
IBM Tech. Publication, Tr 00.01000.677;
January, 1959.
is a type of operation that is sometimes
sought in scatter circuits.⁶

In the previous work,⁷,⁸ a correcting
code was developed to correct only one
type of error, 1-error or 0-error, for an
asymmetric binary channel. As an extension
of the work, a correcting code is
proposed to correct only one type of small
errors, (+1)-errors or (−1)-errors, for a
nonbinary channel which has a transmission
characteristic such that the probability
of (−1)-error is much greater than that
of (+1)-error or vice versa. This assumption
may easily be implemented in practice
by setting the position of brushes for a
disk code generator, or by setting a bias
or threshold level.

Let us take an example of a ternary
channel shown in Fig. 1, where

α = probability of (−1)-error

and

β = probability of (+1)-error (mod 3).



Fig. 1—A ternary channel. 


Referring to Fig. 1, if $\alpha_i = \beta_j$ for all
i and j, then the channel is said to be
symmetric. Otherwise, it may be called an
asymmetric channel. The type of channel
considered in this paper is a uniformly
asymmetric channel which has transmission
characteristics as follows:

$$
\alpha_i > \beta_j \quad (\text{or } \alpha_i < \beta_j) \qquad \text{for all } i \text{ and } j. \tag{1}
$$


Let k be a positive integer, with k > 2,
and assume that each letter (or each digit)
of an n-length word takes k values 0, 1,
2, ..., k − 1. Then, it was shown by Lee³
that the distance (a circular metric), between
two words $s = (s_1, s_2, \dots, s_n)$ and
$t = (t_1, t_2, \dots, t_n)$, required to correct
e-tuple (±1)-errors is


6 The author was introduced to this problem in
correspondence with B. B. Barrow, SHAPE Air
Defense Tech. Center, The Hague, The Netherlands.

7 W. H. Kim and C. V. Freiman, “Single error
correcting codes for asymmetric binary channels,”
IRE Trans. on Information Theory, vol. IT-5,
pp. 62-66; June, 1959.

8 W. H. Kim and C. V. Freiman, “Multi-error
correcting codes for an asymmetric binary channel,”
Trans. 1959 Internatl. Symp. on Circuit and Information
Theory, pp. 71-78; June, 1959.

9 E. N. Gilbert, “Gray codes and paths on the
n-cube,” Bell Sys. Tech. J., vol. 37, pp. 815-826;
May, 1958.




$$
\rho(s, t) = \sum_{i=1}^{n} \rho(s_i, t_i) \ge 2e + 1 \tag{2}
$$

where $\rho(s_i, t_i) = \mathrm{Min}\{s_i - t_i,\ t_i - s_i\} \bmod k$.
However, if a channel exhibits a
highly asymmetric transmission characteristic,
it may sometimes be sufficient
to correct only one type of small error
[(+1)-error or (−1)-error] and the distance
requirement (2) may be considerably
relaxed. The nature of the relaxation may
be clearly seen in the case where a channel
possesses a highly asymmetric uniform
transmission characteristic.⁷,⁸ The weight-distance
requirements for a pair of words
of an e-tuple (−1)-error [or (+1)-error]
correcting code are found to be, for n ≥ 2,
$$
\Delta W(s, t) \ge e + 1 \tag{3a}
$$

or

$$
\rho(s, t) \ge 2(e + 1) - \Delta W(s, t) \tag{3b}
$$

or

$$
\rho(s_i, t_i) \ge e \quad \text{when } l \ge e \ge 2 \tag{3c}
$$

where $W(s) = \sum_{i=1}^{n} s_i \bmod k$, and $l = k/2$
when k is even and $l = (k - 1)/2$
when k is odd. The weight-distance
requirement (3) for asymmetric small-error
correction may readily be deduced by
following the arguments used in the previous
work⁷,⁸ for k = 2.


Single (−1)-Error Correcting Codes


For single (−1)-error correction [or
(+1)-error correction], (3) is rewritten as

$$
\Delta W(s, t) \ge 2 \quad \text{or} \quad \rho(s, t) \ge 3. \tag{4}
$$

Let us use the following notations:

$S_k(r)$ : a set of all $k^r$ sequences.

Prefix : the first r letters of a sequence
of length n.

Suffix : the last (n − r) letters of a sequence
of length n.

$C_k(n, d)$ : a code which is designed for a
symmetric nonbinary channel
where d ≥ 2e + 1.

⊕ : position-by-position addition mod k.

Then, a set of code words may be generated
by the following scheme:

$$
(\underbrace{x_i}_{\text{prefix}},\ \underbrace{x_i \oplus c_j}_{\text{suffix}}) \qquad \text{for all } i \text{ and } j \tag{5}
$$

where $x_i \in S_k(r)$, $c_j \in C_k(n - r, 3)$, $i = 1, 2, \dots, k^r$,
$j = 1, 2, \dots, a_k(n - r, 3)$ or $b_k(n - r, 3)$,
and r = n/2 for n even and r = (n − 1)/2
for n odd. When $x_i$ and $c_j$ are of different
lengths, the left-most position is aligned
before addition.

The proof that the generation scheme
(5) satisfies the weight-distance requirement
(4), is given in the Appendix. It is
shown in the proof that the proposed
scheme (5) satisfies the requirement (4)
for all values of k, where k > 2. Several
examples are given for k = 3 for illustration
and a k-valued code for k > 3 will follow
the same generation scheme.

Example 1: n = 5, and k = 3. Then,
prefixes are all elements in $S_3(2)$ and
suffixes are generated from prefixes by
performing the operation of position-by-position
addition (modulo 3) with each
element in $C_3(3, 3)$, where $C_3(3, 3)$ =
[000, 111, 222].

prefixes   suffixes        suffixes        suffixes
S_3(2)     S_3(2) ⊕ 000    S_3(2) ⊕ 111    S_3(2) ⊕ 222
00         000             111             222
01         010             121             202
02         020             101             212
10         100             211             022
11         110             221             002
12         120             201             012
20         200             011             122
21         210             021             102
22         220             001             112

For n ≤ 4, a slight modification of the
generation scheme (5) will increase the
size of codes considerably, and they are
shown in the following examples for n = 3
and 4.


Example 2:

n = 3 and k = 3            n = 4 and k = 3
prefixes   suffixes        prefixes   suffixes
0          0 ⊕ 00          00         00 ⊕ 00
1          1 ⊕ 00          01         01 ⊕ 00
2          2 ⊕ 00          02         02 ⊕ 00
                           10         10 ⊕ 00
1          1 ⊕ 11          11         11 ⊕ 00
2          2 ⊕ 22          12         12 ⊕ 00
                           20         20 ⊕ 00
                           21         21 ⊕ 00
                           22         22 ⊕ 00
                           01         01 ⊕ 11
                           10         10 ⊕ 11
                           11         11 ⊕ 11
                           02         02 ⊕ 22
                           20         20 ⊕ 22
                           22         22 ⊕ 22


10 $a_k(n, 3)$ denotes the size (number of elements)
of the largest single (±1)-error correcting code in
$S_k(n)$ and $b_k(n, 3)$ the size of the largest single
(±1)-error correcting group code in $S_k(n)$. (See
Lee.³)
The number of elements in a (−1)-error
correcting code is denoted by $h_k(n, 2)$ and
is given by

$$
h_k(n, 2) = k^r\, b_k(n - r, 3) \quad \text{or} \quad k^r\, a_k(n - r, 3) \qquad \text{for } n \ge 5 \text{ and } k \ge 2. \tag{6}
$$

For comparison of an asymmetric code and
a symmetric group code $C_k(n, 3)$, the size
of each code is shown below in Table I
for k = 3 and n up to 14.


TABLE I


The Size of Asymmetric and Symmetric
Ternary Codes


n           2   3   4     5     6     7     8     9     10    11    12    13     14
b_3(n, 3)   1   3   3^2   3^2   3^3   3^4   3^5   3^6   3^7   3^8   3^9   3^10   3^10
h_3(n, 2)   3   5   15    3^3   3^4   3^5   3^6   3^6   3^7   3^8   3^9   3^10   3^11
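
The asymmetric row of Table I can be recomputed from the symmetric row through (6). The sketch below is a consistency check only; it takes the tabulated values of b_3(n, 3) as given and assumes r = n/2 for even n and r = (n − 1)/2 for odd n, as in scheme (5). The names `B3`, `H3_TABLE` and `h3` are introduced here for illustration.

```python
# Symmetric group-code sizes b_3(n, 3), copied from Table I.
B3 = {2: 1, 3: 3, 4: 3**2, 5: 3**2, 6: 3**3, 7: 3**4, 8: 3**5,
      9: 3**6, 10: 3**7, 11: 3**8, 12: 3**9, 13: 3**10, 14: 3**10}

# Asymmetric code sizes h_3(n, 2) for n >= 5, copied from Table I.
H3_TABLE = {5: 3**3, 6: 3**4, 7: 3**5, 8: 3**6, 9: 3**6,
            10: 3**7, 11: 3**8, 12: 3**9, 13: 3**10, 14: 3**11}

def h3(n):
    """Equation (6) for k = 3: h_3(n, 2) = 3^r * b_3(n - r, 3)."""
    r = n // 2
    return 3**r * B3[n - r]

for n in range(5, 15):
    print(n, h3(n), h3(n) == H3_TABLE[n])
```

For n = 8 the product h_3(8, 2)(8 + 1) equals 3^8, so the received sets of a single (−1)-error correcting code of that size fill the whole space, which is the close-packed case of Lemma 1.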


From Theorems 1 and 2 in Lee’s paper³
and (6), Lemma 1 immediately follows.

Lemma 1: let k be an odd prime. If n + 1 
is not a power of k, 


$$
k\, h_k(n, 2) = h_k(n + 1, 2) \qquad \text{for } n \ge 5.
$$


If n + 1 is the power of k, then any asym- 
metric code with h(n, 2) words is a close- 
packed code,’® and 


$$
h_k(n, 2) = h_k(n + 1, 2) \qquad \text{for } n \ge 5.
$$

Remark 1: if values for k and n are so
chosen that

$$
b_k(n, 3) = k^r\, b_k(n - r, 3)
$$

or

$$
k^r = \frac{b_k(n, 3)}{b_k(n - r, 3)}
$$

then the sizes of both symmetric group
code $C_k(n, 3)$ and the asymmetric code
designed by the proposed scheme are the
same.


Multi-Small-Error Asymmetric Correction 


For an e-tuple asymmetric correcting 
ternary code, the generation scheme (5) is 
directly applicable with some modifications: 
divide a given length of a word into (e + 1) 
letters and use an auxiliary code $C_3(r, 2e + 1)$
for $n = (e + 1)r + c$, where $c = 0, 1, 2, \dots, e$.
This generation scheme is a
direct extension of the method used for
an e-tuple 1-error correcting code for a
binary asymmetric channel.⁸

When k > 3, the minimum distance
requirement (3c) may enable us to generate
a code in a very simple manner. For an
example, let k = 5 and n = 3; then a
double (−1)-error [or (+1)-error] correcting
code is shown in Example 3.

Example 3: n = 3, and k = 5. Then the
first group of words for a double (−1)-error-correcting
code is generated as:
prefixes   stems   suffixes
0          0 ⊕ 0   0 ⊕ 0
1          1 ⊕ 0   1 ⊕ 0
2          2 ⊕ 0   2 ⊕ 0
3          3 ⊕ 0   3 ⊕ 0
4          4 ⊕ 0   4 ⊕ 0

Next we will add 2 to each letter of the 
words in the first group, and we have 


0.@2-.0' 0.0) (0D 2asOMO wn Oia 
1@2Q2 £1” HD 2 1 le oe 
DG) 9.2 20192. 2ame tn aaa > 
3@2.3 31353 QO mol sae or 
4G) 2 4.44 4GYo ois 


As shown in Example 3, a sufficient con- 
dition for a pair of words s and ¢ in order 
to correct all single and double (—1)- 
errors when k > 3 and n > 2, is that 
p(SaUte) ee toil ORNs — les Daan ae enee 
fore, the condition that p(s;, t;) > e for 
k > 2e > 4 and n > e, is sufficient for 
e-tuple (—1)-error correction. 

Remark 2: The diophantine equation for
a close-packed, asymmetric double error
correcting code is given, for k > 3,

$$
1 + n + \frac{n!}{2!\,(n - 2)!} = k^x. \tag{7}
$$

Some solutions of (7) are found to be

$$
\begin{array}{lll}
n = 2 & n = 5 & n = 90 \\
k = 4 & k = 4 & k = 4 \\
x = 1 & x = 2 & x = 6.
\end{array} \tag{8}
$$

The first set of solutions in (8) is shown in 
Example 4. This existence of 4-valued, 
closed-packed codes may be a significant 
feature for asymmetric correcting codes 
since there is no 4-valued, closed-packed, 
double error correcting code for a sym- 
metric nonbinary channel.® 
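
A short search illustrates Remark 2. The sketch below assumes that the close-packing condition reads 1 + n + C(n, 2) = k^x, that is, one received sequence for no error, n for a single error and C(n, 2) for a double error on two different letters; this reading of (7) is an assumption made here, and the function name is hypothetical.

```python
from math import comb

def close_packed_solutions(k, n_max):
    """Return all (n, x) with 2 <= n <= n_max and 1 + n + C(n, 2) == k**x."""
    solutions = []
    for n in range(2, n_max + 1):
        size = 1 + n + comb(n, 2)  # sequences within two (-1)-errors of a word
        x, power = 0, 1
        while power < size:
            power *= k
            x += 1
        if power == size:
            solutions.append((n, x))
    return solutions

print(close_packed_solutions(4, 1000))
```

For k = 4 and n up to 1000 this search returns the three word lengths 2, 5 and 90.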

Example 4: A 4-valued, close-packed
double (−1)-error correcting code for
n = 2 and k = 4 is shown.


Code words                     00   12   20   32

Received sequences assigned    03   02   10   22
to the above code word         30   11   23   31
                               33   01   13   21

APPENDIX


Let $x_i, x_j \in S_k(r)$ and $c_i, c_j \in C_k(n - r, 3)$;
then four distinct letters in a (−1)-error
correcting code are generated by performing
the operation of (5) as:

$$
s = (x_i,\ x_i \oplus c_i), \qquad t = (x_j,\ x_j \oplus c_i),
$$

$$
u = (x_i,\ x_i \oplus c_j), \qquad v = (x_j,\ x_j \oplus c_j)
$$

for $i \neq j$.

To check the weight-distance requirement
for each pair of letters, we will use
the following lemmas without proof since 
they may be checked readily. 

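Lemma 2: If $\Delta W(x_i, x_j) = 0$ for $i \neq j$,
then $\rho(x_i, x_j) \ge 2$ for k > 2 and r ≥ 2.

Lemma 3: Let $\rho(x_i, x_j) = d_1$ and
$\rho(x_i \oplus c_i,\ x_j \oplus c_j) = d_2$. Then,

$$
\rho(c_i, c_j) = |d_1 - d_2| + 2a
$$

where

$$
d_1 = 0, 1, \dots, \frac{r(k - 1)}{2}
$$

$$
a = 0, 1, \dots, \mathrm{Min}\left(d_1,\ d_2,\ \frac{r(k - 1)}{2} - d_1,\ \frac{r(k - 1)}{2} - d_2\right).
$$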
“—- _ 4, 7 ae a. 


From Lemma 2, it is clear that

$$
\rho(s, t) \ge 4 \quad \text{for } \Delta W(x_i, x_j) = 0
$$

$$
\Delta W(s, t) \ge 2 \quad \text{for } \Delta W(x_i, x_j) = 1
$$

which satisfy the condition of (4). The same
is true for $\rho(u, v)$.

Since $c_i$ and $c_j$ were chosen such that
$\rho(c_i, c_j) \ge 3$, it is obvious that

$$
\rho(s, u) \ge 3 \quad \text{and} \quad \rho(t, v) \ge 3.
$$

The remaining combinations to investigate
are $\rho(s, v)$ and $\rho(t, u)$. A proof that the
pair of letters s and v satisfy the weight-distance
requirement for single (−1)-error
correction will be given. For the letters t
and u, a proof may be obtained in a similar
manner.

Since an auxiliary code $C_k(n - r, 3)$ is
used to generate suffixes, $\rho(c_i, c_j) \ge 3$ and
$c_i, c_j \in C_k(n - r, 3)$. Then, we have from
Lemma 3

$$
\rho(c_i, c_j) = |d_1 - d_2| + 2a \ge 3 \tag{9}
$$

and $\rho(s, v) = \rho(x_i, x_j) + \rho(x_i \oplus c_i,\ x_j \oplus c_j) = d_1 + d_2$.
Now we are going to prove
that

$$
d_1 + d_2 \ge 3 \qquad \text{with } i \neq j. \tag{10}
$$

From (9), if $d_2 = 0$, then a = 0 and $d_1 \ge 3$,
consequently $\rho(s, v) = d_1 \ge 3$. If $d_2 = 1$,
then a ≤ 1 and $d_1 \ge 2$, consequently
$\rho(s, v) = d_1 + d_2 \ge 3$. For $d_2 \ge 2$, $\rho(s, v) =
d_1 + d_2 \ge 3$, since $d_1 \ge 1$. Hence, a code
generated by the proposed scheme satisfies
the weight-distance requirement for single
asymmetric correction.


Wan H. Kim 

Dept. of Elec. Engrg. 
Columbia University 
New York, N. Y. 